Publications

ORCID: 0000-0002-4840-1095
Google Scholar

A Mathematical Theory of Correct Computation, Preprints, 2026.

Authors

Lee Naish, Bernard Pope, Harald Søndergaard

Journal

Preprints

Year

2026

DOI

10.20944/preprints202607.1206.v1

URL

https://www.preprints.org/manuscript/202607.1206/v1

Keywords

logic programming; functional programming; semantics; partial correctness; declarative debugging; specification; information ordering; complete lattice

Abstract

In 1970, Dana Scott proposed his highly influential "mathematical theory of computation" to define the relationship between the text of a program and what the program computes (or denotes) - the "semantics" of the program. Scott used a complete lattice based on the "information ordering", with the bottom element representing undefined - a program failing to terminate normally, thus producing no information. The top element, however, was unused. Hence most subsequent applications of denotational semantics have used mathematical structures that avoid top elements. We suggest that the information ordering is relevant not only to semanticists, but also to working programmers, as a basis for determining if a program component or a computation is correct according to their intentions. We also suggest that a return to the use of complete lattices is called for if we wish to broaden formal semantics to allow it to encompass programmer intentions. That is because often those intentions permit more than one runtime behaviour for a given input. For example, several declarative debugging tools allow a programmer to declare that a runtime call is "inadmissible", loosely meaning the called program component should never be used with such input. If the input is garbage, the programmer does not care what garbage is output - all results are acceptable, which can be modeled by the top element of a complete lattice. In this paper we explore the connections between the information ordering, correctness of computations and programs, and debugging. We present a general theory and describe several instances where the intention for what our logic/functional code computes plus what it actually computes can be described by elements in a complete lattice. 
This extends both the theoretical basis and practical flexibility of declarative debugging and reasoning about partial correctness and gives an attractive mathematical framework that encompasses our intentions, our programs and what they compute.

Lynch syndrome caused by a pathogenic SINE-VNTR-Alu (SVA) insertion in MSH2 gene identified by long-read DNA sequencing, Familial Cancer, 2026.

Authors

Jihoon E Joo, Khalid Mahmood, Mark Clendenning, Peter Georgeson, Romy Walker, Julia Como, Fiona Phillips, Bernard J Pope, Steven Batinovic, Natalie Diepenhorst, Julie McDonald, Toni Rice, Christophe Rosty, Mark A Jenkins, Finlay A Macrae, Ingrid M Winship, Hilda High, Daniel D Buchanan

Journal

Familial Cancer

Volume

25

Year

2026

DOI

10.1007/s10689-026-00588-7

Pubmed ID

42423801

URL

https://doi.org/10.1007/s10689-026-00588-7

Keywords

Lynch syndrome; DNA mismatch repair genes; Long-read sequencing; Oxford Nanopore Technologies; SINE-VNTR-Alu (SVA) insertion; Retrotransposon insertion; Structural variant

Abstract

Lynch syndrome, the most common hereditary cancer syndrome, is caused by germline pathogenic variants in DNA mismatch repair (MMR) genes. Identifying complex or structural MMR gene pathogenic variants can be challenging with short-read sequencing, resulting in patients with unexplained MMR-deficient tumours. In this study, we report multiple members of a family who developed MSH2-deficient tumours where clinical multi-gene panel testing of the DNA MMR genes using short-read sequencing did not find a germline pathogenic variant. Oxford Nanopore Technologies adaptive sampling (ONT-AS) long-read sequencing, targeting 104 hereditary cancer genes, identified a shared ~3.2 kb SINE-VNTR-Alu (SVA) family F retrotransposon insertion within exon 12 of MSH2 in both the proband and her father. Segregation of this MSH2 SVA insertion by targeted PCR confirmed three additional family members as carriers, including a paternal uncle with three colorectal cancers and an MSH2-deficient sebaceous adenoma. Three of the five heterozygous carriers were cancer affected with at least one MSH2-deficient tumour each. This study demonstrates that long-read sequencing can identify structural variants that are missed by current short-read sequencing multi-gene panel testing, improving the diagnosis of Lynch syndrome. These findings support incorporating long-read sequencing into routine diagnostic workflows for patients with suspected Lynch syndrome following a negative germline test.

The Landscape of Prostate Tumour Methylation, Cancer Discovery, 2026.

Authors

Jaron Arbet, Takafumi N Yamaguchi, Yu-Jia Shiah, Rupert Hugh-White, Adriana Wiggins, Jieun Oh, Nicole Zeltser, Tsumugi Gebo, Adrien Foucal, Robert Lesurf, Chol-Hee Jung, Rachel M A Dang, Raag Agrawal, Julie Livingstone, Adriana Salcedo, Cindy Q Yao, Shadrielle Melijah G Espiritu, Kathleen Houlahan, Fouad Yousif, Lawrence E Heisler, Anthony T Papenfuss, Michael Fraser, Bernard J Pope, Amar U Kishan, Alejandro Berlin, Melvin L K Chua, Niall M Corcoran, Theodorus van der Kwast, Christopher M Hovens, Robert G Bristow, Paul C Boutros

Journal

Cancer Discovery

Year

2026

DOI

10.1158/2159-8290.CD-25-0761

Pubmed ID

42307031

URL

https://doi.org/10.1158/2159-8290.CD-25-0761

Abstract

Prostate cancer is characterized by profound clinical and molecular heterogeneity. While its genomic heterogeneity is well-characterized, its epigenomic heterogeneity remains less understood. We therefore created a compendium of 3,001 multi-ancestry prostate methylomes spanning normal tissue through localized disease of all grades to poly-metastatic disease. A subset of 884 samples had multi-omic DNA and/or RNA characterization. We identify four epigenomic subtypes that risk-stratify patients and reflect distinct evolutionary trajectories. We demonstrate extensive regulatory interplay between DNA copy number and methylation, with transcriptional consequences that vary across genes and disease stages. We define epigenetic dysregulation signatures for 15 important clinico-molecular features, creating predictive models for each. For example, we identify specific epigenetic features that predict patient outcome and are synergistic with clinical prognostic features. These results define a complex interplay between tumour genetics and epigenetics that converges to modify gene-expression programs and clinical presentation, in part through modulation of epigenetic aging.

A mosaic of genomic architectures underpins parasitism loss in a jawless vertebrate, bioRxiv, 2026.

Authors

Arne Jacobs, Nolwenn Decanter, Ole K Tørresen, Benedicte Garmann-Aarhus, Maria Capstick, Quentin Rougemont, Frédéric Guillaume, Romane Normand, Julien Tremblay, Jean-Pierre Destouches, Anne-Laure Besnard, Ahmed Souissi, Gilles Lassalle, Solenn Stoeckel, Eric J Petit, Siv NK Hoff, Daniel J Park, Bernard J Pope, Sissel Jentoft, Leif Asbjørn Vøllestad, Kjetill S Jakobsen, Guillaume Evanno

Journal

bioRxiv

Year

2026

DOI

10.64898/2026.05.11.724254

URL

https://doi.org/10.64898/2026.05.11.724254

Abstract

Lampreys are the only ancestrally parasitic vertebrate lineage, yet parasitism has been repeatedly lost alongside a suite of life-history changes, such as loss of migration and juvenile feeding and accelerated maturation. Combining whole-genome resequencing, haplotype-resolved assemblies, hybrid-zone genotyping, multi-tissue transcriptomics, and sperm phenotyping, we map this life-history syndrome in European Lampetra to six chromosomes spanning a mosaic of genomic architectures: a ∼20 Mb low-recombination region on chromosome 1 lacking chromosomal rearrangements within Lampetra but involving inter-specific rearrangements across deep lamprey lineages; a translocated inversion with ecotype-dependent sperm-velocity effects; and ecotype-divergent deletions overlapping genes crucial for nervous system (CNTNAP2) and reproductive development (FSHR). However, this genomic basis is not shared with a convergent sister lineage, pointing to independent routes to a recurring life-history transition in lampreys.

Cancer Risks in First-Degree Relatives of Individuals with Biallelic Somatic DNA Mismatch Repair Mutations, International Journal of Cancer, 2026.

Authors

Romy Walker, James G Dowty, Hsin-Lun Liao, Mark Clendenning, Jihoon E Joo, Khalid Mahmood, Peter Georgeson, Sharelle Joseland, Fiona Phillips, Julia Como, Susan G Preston, Bernard J Pope, Aung K Win, Christophe Rosty, Finlay A Macrae, Ingrid M Winship, Mark A Jenkins, Daniel D Buchanan

Journal

International Journal of Cancer

Volume

158

Year

2026

Abstract

Conference proceedings: 9th Meeting of the European Hereditary Tumour Group (EHTG), Heidelberg, Germany, Sep 19-21, 2025

Pathogenic variants reveal candidate genes for prostate cancer germline testing for men of African ancestry, Nature Communications, 2025.

Authors

Kazzem Gheybi, Pamela X. Y. Soh, Jue Jiang, Tumisang M. N. Mbeki, Melanie Louw, Daniel Burns, Piyushkumar Mundra, Daria Kiriy, Md. Mehedi Hasan, Weerachai Jaratlerdsiri, Maphuti Tebogo Lebelo, Raymond A. Campbell, Mulalo B. Radzuma, Mukudeni Nenzhelele, Muvhulawa Obida, Martin Obida, Winstar M. Ombuki, Micah O. Oyaro, Sean M. Patrick, Massimo Loda, David C. Wedge, Robert G. Bristow, Daniel S. Brewer, Colin S. Cooper, Jüri Reimand, Geraldine Cancel-Tassin, Olivier Cussenot, Chris M. Hovens, Niall M. Corcoran, Phillip D. Stricker, Thorsten Schlomm, Gail S. Prins, Karina Dalsgaard Sørensen, Pan Prostate Cancer Group, HEROIC PCaPH Africa1K, Joachim Weischenfeldt, Shingai B. A. Mutambirwa, Peter M. Ngugi, David M. Thomas, Zsofia Kote-Jarai, Rosalind A. Eeles, M. S. Riana Bornman, Vanessa M. Hayes

Consortium authors

Rosalind A. Eeles, Colin S. Cooper, G. Steven S. Bova, Daniel S. Brewer, Robert G. Bristow, Mark N. Brook, Benedict Brors, Daniel Burns, Adam Butler, Geraldine Cancel-Tassin, Kevin C. L. Cheng, Niall M. Corcoran, Olivier Cussenot, Francesco Favero, Clarissa Gerhauser, Abraham Gihawi, Etsehiwot G. Girma, Vincent J. Gnanapragasam, Andreas J. Gruber, Anis Hamid, Vanessa M. Hayes, Housheng Hansen He, Chris M. Hovens, Eddie Luidy Imada, G. Maria Jakobsdottir, Weerachai Jaratlerdsiri, Jue Jiang, Chol-Hee Jung, Francesca Khani, Daria Kiriy, Zsofia Kote-Jarai, Philippe Lamy, Gregory Leeman, Massimo Loda, Pavlo Lutsik, Luigi Marchionni, Ramyar Molania, Anthony T. Papenfuss, Diogo Pellegrina, Bernard Pope, Lucio R. Queiroz, Tobias Rausch, Jüri Reimand, Brian Robinson, Atef Sahli, Thorsten Schlomm, Pamela X. Y. Soh, Karina Dalsgaard Sørensen, Sebastian Uhrig, David C. Wedge, Joachim Weischenfeldt, Yaobo Xu, Takafumi N. Yamaguchi, Claudio Zanettini, Vanessa M. Hayes, M. S. Riana Bornman, Peter Mungai Ngugi, Gail S. Prins, Weerachai Jaratlerdsiri, Winstar M. Ombuki, Sean M. Patrick, Daniel M. Moreira, Ikenna C. Madueke, Maria Argos, Irene E. J. Barnhoorn, Lynn Birch, Daniel S. Brewer, Robert G. Bristow, Raymond A. Campbell, Colin S. Cooper, Jenna Craddock, Rosalind A. Eeles, G. Nicolo’ Fanelli, Eva Ferlev Jensby, Hagen E. A. Förtsch, Jessie Gamxamub, Kazzem Gheybi, Abraham Gihawi, Tingting Gong, Md. Mehedi Hasan, Vivien Holmes, Ruotian Huang, Jue Jiang, Zsofia Kote-Jarai, Maphuti Tebogo Lebelo, Massimo Loda, Melanie Louw, Pavlo Lutsik, Umuna Maendo, Tumisang M. N. Mbeki, Reginald Menoe, Shingai B. A. Mutambirwa, Muriuki Elias Nyaga, Micah O. Oyaro, Willis Oyieko, Joyce Shirinde, Pamela X. Y. Soh, Golda Stellmacher, Avraam Tapinos, Korawich Uthayopas, Douglas I. Walker, Edwin O. O. Walong, Githui Sheila Wanjiku, David C. Wedge, Allan Yienya, Kangping Zhou

Journal

Nature Communications

Volume

16

Issue

8799

Year

2025

DOI

10.1038/s41467-025-63865-6

Pubmed ID

39945744

URL

https://doi.org/10.1038/s41467-025-63865-6

Abstract

Prostate cancer (PCa) germline testing, while gaining momentum, is ancestry restrictive and African exclusive. Through whole genome sequencing for 217 African ancestral cases (186 southern African, 31 Pan representative), we identify 172 potentially pathogenic variants in 78 DNA damage repair or PCa related genes. Prevalence for reported (13/217, 5.99%) and cumulative predicted (24/217, 11.06%) variants of significance (11 genes) falls below that reported for non-Africans. Conversely, BRCA1, HOXB13, CDK12, MLH1, MSH2, and BRIP1 remain unimpacted. Through pathogenic ranking based on variant frequency and functionality, clinical presentation and tumour-matched biallelic inactivation, top-ranked candidates include PREX2, POLE, FAT1, BRCA2, POLQ, LRP1B and ATM. Besides notable impact of DNA polymerases, including POLG, Fanconi anaemia genes include FANCD2, FANCA, FANCG, ERCC4, FANCE and FANCI, while DNA mismatch repair genes MSH3 and PMS1 outranked known namesakes MSH6 and PMS2. This study provides insights into the spectrum of African-relevant potentially pathogenic PCa variants, highlighting much-needed gene candidates for ancestry-inclusive germline testing.

The Germline and Somatic Origins of Prostate Cancer Heterogeneity, Cancer Discovery, 2025.

Authors

Takafumi N Yamaguchi, Kathleen E Houlahan, Helen Zhu, Natalie Kurganovs, Julie Livingstone, Natalie S Fox, Jiapei Yuan, Jocelyn Sietsma Penington, Chol-Hee Jung, Tommer Schwarz, Weerachai Jaratlerdsiri, Job van Riet, Peter Georgeson, Stefano Mangiola, Kodi Taraszka, Robert Lesurf, Jue Jiang, Ken Chow, Lawrence E Heisler, Yu-Jia Shiah, Susmita G Ramanand, Michael J Clarkson, Anne Nguyen, Shadrielle Melijah G Espiritu, Ryan Stuchbery, Richard Jovelin, Vincent Huang, Connor Bell, Edward O'Connor, Patrick J McCoy, Christopher M Lalansingh, Marek Cmero, Adriana Salcedo, Eva K F Chan, Lydia Y Liu, Phillip D Stricker, Vinayak Bhandari, Riana M S Bornman, Dorota Hs Sendorek, Andrew Lonie, Stephenie D Prokopec, Michael Fraser, Justin S Peters, Adrien Foucal, Shingai B A Mutambirwa, Lachlan Mcintosh, Michèle Orain, Matthew Wakefield, Valérie Picard, Daniel J Park, Hélène Hovington, Michael Kerger, Alain Bergeron, Veronica Sabelnykova, Ji-Heui Seo, Mark M Pomerantz, Noah Zaitlen, Sebastian M Waszak, Alexander Gusev, Louis Lacombe, Yves Fradet, Andrew Ryan, Amar U Kishan, Martijn P Lolkema, Joachim Weischenfeldt, Bernard Têtu, Anthony J Costello, Vanessa M Hayes, Rayjean J Hung, Housheng H He, John D McPherson, Bogdan Pasaniuc, Theodorus van der Kwast, Anthony T Papenfuss, Matthew L Freedman, Bernard J Pope, Robert G Bristow, Ram S Mani, Niall M Corcoran, Jüri Reimand, Christopher M Hovens, Paul C Boutros

Journal

Cancer Discovery

Year

2025

DOI

10.1158/2159-8290.CD-23-0882

Pubmed ID

39945744

URL

https://doi.org/10.1158/2159-8290.CD-23-0882

Abstract

Newly diagnosed prostate cancers differ dramatically in mutational composition and lethality. The most accurate clinical predictor of lethality is tumor tissue architecture, quantified as tumor grade. To interrogate the evolutionary origins of prostate cancer heterogeneity, we analyzed 666 prostate tumor whole genomes. We identified a compendium of 223 recurrently mutated driver regions, most influencing downstream mutational processes and gene expression. We identified and validated individual germline variants that predispose tumors to acquire specific somatic driver mutations: these explain heterogeneity in disease presentation and ancestry differences. High-grade tumors have a superset of the drivers in lower-grade tumors, including increased frequency of BRCA2 and MYC mutations. Grade-associated driver mutations occur early in tumor evolution, and their earlier occurrence strongly predicts cancer relapse and metastasis. Our data suggest high- and low-grade prostate tumors both emerge from a common pre-malignant field, influenced by germline genomic context and stochastic mutation-timing.

Adenomas from individuals with pathogenic biallelic variants in the MUTYH and NTHL1 genes demonstrate base excision repair tumour mutational signature profiles similar to colorectal cancers, expanding potential diagnostic and variant classification applications, Translational Oncology, 2025.

Authors

Romy Walker, Jihoon E. Joo, Khalid Mahmood, Mark Clendenning, Julia Como, Susan G. Preston, Sharelle Joseland, Bernard J. Pope, Ana B.D. Medeiros, Brenely V. Murillo, Nicholas Pachter, Kevin Sweet, Allan D. Spigelman, Alexandra Groves, Margaret Gleeson, Krzysztof Bernatowicz, Nicola Poplawski, Lesley Andrews, Emma Healey, Steven Gallinger, Robert C. Grant, Aung K. Win, John L. Hopper, Mark A. Jenkins, Giovana T. Torrezan, Christophe Rosty, Finlay A. Macrae, Ingrid M. Winship, Daniel D. Buchanan, Peter Georgeson

Journal

Translational Oncology

Volume

52

Year

2025

DOI

10.1016/j.tranon.2024.102266

URL

https://doi.org/10.1016/j.tranon.2024.102266

Keywords

Mutational signature; Adenoma; SBS36; SBS30; MUTYH; NTHL1; Variant of uncertain clinical significance

Abstract

Background
Colorectal cancers (CRCs) from people with biallelic germline likely pathogenic/pathogenic variants in MUTYH or NTHL1 exhibit specific single base substitution (SBS) mutational signatures, namely combined SBS18 and SBS36 (SBS18+SBS36), and SBS30, respectively. The aim was to determine if adenomas from biallelic cases demonstrated these mutational signatures at diagnostic levels.
Methods
Whole-exome sequencing of FFPE tissue and matched blood-derived DNA was performed on 9 adenomas and 15 CRCs from 13 biallelic MUTYH cases, on 7 adenomas and 2 CRCs from 5 biallelic NTHL1 cases and on 27 adenomas and 26 CRCs from 46 non-hereditary (sporadic) participants. All samples were assessed for COSMIC v3.2 SBS mutational signatures.
Results
In biallelic MUTYH cases, SBS18+SBS36 signature proportions in adenomas (mean ± standard deviation, 65.6 % ± 29.6 %) were not significantly different to those observed in CRCs (76.2 % ± 20.5 %, p-value=0.37), but were significantly higher compared with non-hereditary adenomas (7.6 % ± 7.0 %, p-value=3.4 × 10⁻⁴). Similarly, in biallelic NTHL1 cases, SBS30 signature proportions in adenomas (74.5 % ± 9.4 %) were similar to those in CRCs (78.8 % ± 2.4 %) but significantly higher compared with non-hereditary adenomas (2.8 % ± 3.6 %, p-value=5.1 × 10⁻⁷). Additionally, a compound heterozygote with the c.1187G>A p.(Gly396Asp) pathogenic variant and the c.533G>C p.(Gly178Ala) variant of unknown significance (VUS) in MUTYH demonstrated high levels of SBS18+SBS36 in four adenomas and one CRC, providing evidence for reclassification of the VUS to pathogenic.
Conclusions
SBS18+SBS36 and SBS30 were enriched in adenomas at comparable proportions to those observed in CRCs from biallelic MUTYH and biallelic NTHL1 cases, respectively. Therefore, testing adenomas may improve the identification of biallelic cases and facilitate variant classification, ultimately enabling opportunities for CRC prevention.

DNA mismatch repair (MMR) gene mosaicism is rare in people with MMR-deficient cancers, Gastroenterology, 2025.

Authors

Romy Walker, Jihoon E. Joo, Khalid Mahmood, Peter Georgeson, The ANGELS-CCFR-Muir-Torre Consortium, Ingrid M. Winship, Daniel D. Buchanan

Consortium authors

Mark Clendenning, Sharelle Joseland, Julia Como, Susan G. Preston, Sarah Stoss, Christophe Rosty, Bernard J. Pope, Finlay A. Macrae, Aung K. Win, John L. Hopper, Mark A. Jenkins, John D. Potter, N. Jewel Samadder, and Michael D. Walsh

Journal

Gastroenterology

Year

2025

DOI

10.1053/j.gastro.2024.12.027

URL

https://doi.org/10.1053/j.gastro.2024.12.027

Reproductive modes in populations of late-acting self-incompatible and self-compatible polyploid Ludwigia grandiflora subsp. hexapetala in western Europe, Peer Community Journal, 2024.

Authors

Solenn Stoeckel, Ronan Becheler, Luis Portillo-Lemus, Marilyne Harang, Anne-Laure Besnard, Gilles Lassalle, Romain Causse-Védrines, Sophie Michon-Coudouel, Daniel Park, Bernard Pope, Eric Petit, Dominique Barloy

Journal

Peer Community Journal

Volume

4

Year

2024

DOI

10.24072/pcjournal.458

URL

https://doi.org/10.24072/pcjournal.458

Abstract

Reproductive mode, i.e., the proportion of individuals produced by clonality, selfing and outcrossing in populations, determines how hereditary material is transmitted through generations. It shapes genetic diversity and its structure over time and space, which can be used to infer reproductive modes. Ludwigia grandiflora subsp. hexapetala (Lgh) is a partially clonal, polyploid, hermaphroditic, and heteromorphic plant that recently colonized multiple countries worldwide. In western Europe, individuals are either self-incompatible caused by a late-acting self-incompatibility (LSI) system developing long-styled flowers, or self-compatible (SC), with short-styled flowers. In this study, we genotyped 53 long- and short-styled populations newly colonizing France and northern Spain using SNPs to estimate rates of clonality, selfing and outcrossing. We found that populations reproduced mainly clonally but with a high diversity of genotypes along with rates of sexuality ranging from 10% up to 40%. We also found evidence for local admixture between long- and short-styled populations in a background of genetic structure between floral morphs that was twice the level found within morphs. Long- and short-styled populations showed similar rates of clonality, but short-styled populations presented significantly higher rates of selfing, as expected considering their breeding system and despite the small rates of failure of the LSI system. Within the 53 studied populations, the 13 short-styled populations had fewer effective alleles, lower observed heterozygosity, and higher inbreeding coefficients, linkage disequilibrium and estimates of selfing than what was found in long-styled populations. These results emphasize the necessity to consider the variation of reproductive modes when managing invasive plant species. 
The overall maintenance of higher genetic diversity with the possibility of maintaining populations clonally in the absence of compatible partners may explain why long-styled individuals seem to be more prevalent in all newly expanding populations worldwide. Beyond Lgh, our methodological approach may inspire future studies to assess the reproductive modes in other autopolyploid populations.

The multi-omic landscape of somatic variation across a pan-cancer cohort of brain metastases, Neuro-Oncology Advances, 2024.

Authors

Grace Hall, Bethany Campbell, Stanley Stylli, Chol-Hee Jung, Bernard Pope, Daniel Park, Ramyar Molania, Justin Bedo, Jennifer Ureta, Silvia Rodrigues, Caroline Fidalgo-Ribeiro, Massimo Loda, Patrick McCoy, Clint Gray, Swee Tan, Agadha Wickremesekera, Christobel Saunders, Kate Drummond, Niall Corcoran, James Dimou, Tony Papenfuss, Christopher Hovens

Journal

Neuro-Oncology Advances

Volume

6

Year

2024

DOI

10.1093/noajnl/vdae090.004

URL

https://doi.org/10.1093/noajnl/vdae090.004

Abstract

INTRODUCTION. Brain metastases pose a significant and growing challenge in clinical practice, yet the molecular variants of this tumor type have not previously been examined comprehensively across multiple cancer types. In particular, there has been limited focus on contrasting the genetic profiles of common brain metastases originating from primary sites like lung, breast, melanoma, colorectal, and renal cancers, with those arising infrequently such as prostate and others. METHODS. We have prospectively collected over 125 fresh frozen brain metastases and matching germline reference samples from 14 different primary tumor types and to date have performed WGS, RNA-Seq and Epic 850k methylation profiling on 60 of these samples spanning both common and rare brain metastases. We have employed machine learning algorithms, network analysis techniques, and integrative bioinformatics pipelines to extract meaningful insights into the biological underpinnings of brain metastasis formation. RESULTS. We conducted an in-depth investigation into the genomic, transcriptomic, and epigenomic profiles of brain metastases, spanning both common and rare types. By merging these diverse datasets, we identified distinct molecular modifications linked to brain metastases with low incidence rates compared to those more frequently observed. Notably, we observed heightened alterations in the regulation of Golgi dynamics, sensing of lipid species, chromatin remodeling factors, and cytoskeletal remodeling in rare brain metastases. These changes likely influence cell migration and invasion dynamics, elucidating the potential for unique characteristics of less common brain metastasis types. CONCLUSIONS. This research indicates that the molecular mechanisms driving the formation of brain metastases may vary depending on whether the primary malignancy frequently or infrequently spreads to the brain.
Studies such as these will aid in eventually driving innovations in precision oncology and ultimately improving patient outcomes.

Prevalence of Germline Pathogenic Variants in Renal Cancer Predisposition Genes in a Population-Based Study of Renal Cell Carcinoma, Cancers, 2024.

Authors

Fiona Bruinsma, Philip Harraka, Susan Jordan, Daniel J. Park, Bernard Pope, Jason Steen, Roger L. Milne, Graham G. Giles, Ingrid Winship, Katherine M. Tucker, Melissa C. Southey, Nguyen-Dumont

Journal

Cancers

Volume

16

Issue

17

Year

2024

DOI

10.3390/cancers16172985

URL

https://doi.org/10.3390/cancers16172985

Abstract

Renal cell carcinoma (RCC) has been associated with germline pathogenic or likely pathogenic (PLP) variants in recognised cancer susceptibility genes. Studies of RCC using gene panel sequencing have been highly variable in terms of study design, genes included, and reported prevalence of PLP variant carriers (4–26%). Studies that restricted their analysis to established RCC predisposition genes identified variants in 1–6% of cases. This work assessed the prevalence of clinically actionable PLP variants in renal cancer predisposition genes in an Australian population-based sample of RCC cases. Germline DNA from 1029 individuals diagnosed with RCC who were recruited through the Victoria and Queensland cancer registries was screened using a custom amplicon-based panel of 21 genes. Mean age at cancer diagnosis was 60 ± 10 years, and two-thirds (690, 67%) of the participants were men. Eighteen participants (1.7%) were found to carry a PLP variant. Genes with PLP variants included BAP1, FH, FLCN, MITF, MSH6, SDHB, TSC1, and VHL. Most carriers of PLP variants did not report a family history of the disease. Further exploration of the clinical utility of gene panel susceptibility testing for all RCCs is warranted.

Genomic evolution shapes prostate cancer disease type, Cell Genomics, 2024.

Authors

Dan J Woodcock, Atef Sahli, Ruxandra Teslo, Vinayak Bhandari, Andreas J Gruber, Aleksandra Ziubroniewicz, Gunes Gundem, Yaobo Xu, Adam Butler, Ezequiel Anokian, Bernard J Pope, Chol-Hee Jung, Maxime Tarabichi, Stefan C Dentro, J Henry R Farmery, CRUK ICGC Prostate Group, Peter Van Loo, Anne Y Warren, Vincent Gnanapragasam, Freddie C Hamdy, G Steven Bova, Christopher S Foster, David E Neal, Yong-Jie Lu, Zsofia Kote-Jarai, Michael Fraser, Robert G Bristow, Paul C Boutros, Anthony J Costello, Niall M Corcoran, Christopher M Hovens, Charlie E Massie, Andy G Lynch, Daniel S Brewer, Rosalind A Eeles, Colin S Cooper, David C Wedge

Journal

Cell Genomics

Year

2024

DOI

10.1016/j.xgen.2024.100511

Pubmed ID

38428419

URL

https://doi.org/10.1016/j.xgen.2024.100511

Keywords

AR binding; cancer evolution; evotype model; evotypes; ordering; prostate cancer

Abstract

The development of cancer is an evolutionary process involving the sequential acquisition of genetic alterations that disrupt normal biological processes, enabling tumor cells to rapidly proliferate and eventually invade and metastasize to other tissues. We investigated the genomic evolution of prostate cancer through the application of three separate classification methods, each designed to investigate a different aspect of tumor evolution. Integrating the results revealed the existence of two distinct types of prostate cancer that arise from divergent evolutionary trajectories, designated as the Canonical and Alternative evolutionary disease types. We therefore propose the evotype model for prostate cancer evolution wherein Alternative-evotype tumors diverge from those of the Canonical-evotype through the stochastic accumulation of genetic alterations associated with disruptions to androgen receptor DNA binding. Our model unifies many previous molecular observations, providing a powerful new framework to investigate prostate cancer disease progression.

A Family-Based Study of Inherited Genetic Risk in Lipedema, Lymphatic Research and Biology, 2024.

Authors

Steven Morgan, Isabella Reid, Charlotte Bendon, Musarat Ishaq, Ramin Shayan, Bernard Pope, Daniel Park, Tara Karnezis

Journal

Lymphatic Research and Biology

Year

2024

DOI

10.1089/lrb.2023.0065

Pubmed ID

38407896

URL

https://doi.org/10.1089/lrb.2023.0065

Keywords

family study; genetic risk; lipedema

Abstract

Background: Lipedema is a progressive condition involving excessive deposition of subcutaneous adipose tissue, predominantly in the lower limbs, which severely compromises quality of life. Despite the impact of lipedema, its molecular and genetic bases are poorly understood, making diagnosis and treatment difficult. Historical evaluation of individuals with lipedema indicates a positive family history in 60%-80% of cases; however, genetic investigation of larger family cohorts is required. Here, we report the largest family-based sequencing study to date, aimed at identifying genetic changes that contribute to lipedema. Methods and Results: DNA samples from 31 individuals from 9 lipedema families were analyzed to reveal genetic variants predicted to alter protein function, yielding candidate variants in 469 genes. We did not identify any individual genes that contained likely disease-causing variants across all participating families. However, gene ontology analysis highlighted vasopressin receptor activity, microfibril binding, and patched binding as statistically significantly overrepresented categories for the set of candidate variants. Conclusions: Our study suggests that lipedema is not caused by a single exomic genetic factor, providing support for the hypothesis of genetic heterogeneity in the etiology of lipedema. As the largest study of its kind in the lipedema field, the results advance our understanding of the disease and provide a roadmap for future research aimed at improving the lives of those affected by lipedema.

Ultrasensitive Detection of Circulating Tumour DNA enriches for Patients with a Greater Risk of Recurrence of Clinically Localised Prostate Cancer, European Urology, 2024.

Authors

Bernard Pope, Gahee Park, Edmund Lau, Jelena Belic, Radoslaw Lach, Anne George, Patrick McCoy, Anne Nguyen, Corrina Grima, Bethany Campbell, Chol-Hee Jung, Emma-Jane Ditter, Hui Zhao, The Pan Prostate Cancer Group (PPCG), David C. Wedge, Daniel S. Brewer, Andy G. Lynch, Harveer Dev, Vincent J. Gnanapragasam, Nitzan Rosenfeld, Christopher M. Hovens, Niall M. Corcoran, Charles E. Massie

Journal

European Urology

Volume

85

Issue

4

Year

2024

DOI

10.1016/j.eururo.2024.01.002

Pubmed ID

38378299

URL

https://doi.org/10.1016/j.eururo.2024.01.002

Abstract

Circulating tumour DNA (ctDNA) has clinical applications as a “liquid biopsy” owing to its short half-life, noninvasive collection modalities, and propensity for sampling across populations of tumour cells. While successful ctDNA detection has been demonstrated in metastatic prostate cancer, localised disease yields low levels of ctDNA, making detection difficult using conventional methods. In this study, we assessed ctDNA detection in localised prostate cancer using the high-sensitivity integration of variant reads (INVAR) method and tested the hypothesis that the presence of ctDNA is associated with high-risk disease. Tumour information was used to create marker panels of patient-specific mutations that were used to identify ctDNA molecules in patient-matched plasma samples. After initial background error calculations and filtering steps, the detected ctDNA fraction was estimated as a global integrated mutant allele fraction (IMAF) using the background-subtracted mean allele fraction across the patient-specific loci in each sample. Patient-specific scores were compared to the threshold value for classification, which was calculated using data from control samples and set to 95% specificity.

Inherited BRCA1 and RNF43 pathogenic variants in a familial colorectal cancer type X family, Familial Cancer, 2023.

Authors

James M. Chan, Mark Clendenning, Sharelle Joseland, Peter Georgeson, Khalid Mahmood, Jihoon E. Joo, Romy Walker, Julia Como, Susan Preston, Shuyi Marci Chai, Yen Lin Chu, Aaron L. Meyers, Bernard J. Pope, David Duggan, J. Lynn Fink, Finlay A. Macrae, Christophe Rosty, Ingrid M. Winship, Mark A. Jenkins & Daniel D. Buchanan

Journal

Familial Cancer

Year

2023

DOI

10.1007/s10689-023-00351-2

Pubmed ID

38063999

URL

https://doi.org/10.1007/s10689-023-00351-2

Keywords

BRCA1; Colorectal cancer; Digenic inheritance; FCCTX; Germline pathogenic variant; RNF43; Serrated polyposis syndrome

Abstract

Genetic susceptibility to familial colorectal cancer (CRC), including for individuals classified as Familial Colorectal Cancer Type X (FCCTX), remains poorly understood. We describe a multi-generation CRC-affected family segregating pathogenic variants in both BRCA1, a gene associated with breast and ovarian cancer, and RNF43, a gene associated with Serrated Polyposis Syndrome (SPS). A single family out of 105 families meeting the criteria for FCCTX (Amsterdam I family history criteria with mismatch repair (MMR)-proficient CRCs) recruited to the Australasian Colorectal Cancer Family Registry (ACCFR; 1998–2008) that underwent whole exome sequencing (WES) was selected for further testing. CRC and polyp tissue from four carriers were molecularly characterized, including a single CRC that underwent WES to determine tumor mutational signatures and loss of heterozygosity (LOH) events. Ten carriers of a germline pathogenic variant BRCA1:c.2681_2682delAA p.Lys894ThrfsTer8 and eight carriers of a germline pathogenic variant RNF43:c.988C>T p.Arg330Ter were identified in this family. Seven members carried both variants, four of whom developed CRC. A single carrier of the RNF43 variant met the 2019 World Health Organization (WHO2019) criteria for SPS, developing a BRAF p.V600 wildtype CRC. Loss of the wildtype allele for both BRCA1 and RNF43 variants was observed in three CRC tumors while a LOH event across chromosome 17q encompassing both genes was observed in a CRC. Tumor mutational signature analysis identified the homologous recombination deficiency (HRD)-associated COSMIC signatures SBS3 and ID6 in a CRC for a carrier of both variants. Our findings show digenic inheritance of pathogenic variants in BRCA1 and RNF43 segregating with CRC in a FCCTX family. LOH and evidence of BRCA1-associated HRD support the importance of both these tumor suppressor genes in CRC tumorigenesis.

DNA Mismatch Repair Gene Variant Classification: Evaluating the Utility of Somatic Mutations and Mismatch Repair Deficient Colonic Crypts and Endometrial Glands, Cancers, 2023.

Authors

Romy Walker, Khalid Mahmood, Julia Como, Mark Clendenning, Jihoon E. Joo, Peter Georgeson, Sharelle Joseland, Susan G. Preston, Bernard J. Pope, James M. Chan, Rachel Austin, Jasmina Bojadzieva, Ainsley Campbell, Emma Edwards, Margaret Gleeson, Annabel Goodwin, Marion T. Harris, Emilia Ip, Judy Kirk, Julia Mansour, Helen Mar Fan, Cassandra Nichols, Nicholas Pachter, Abiramy Ragunathan, Allan Spigelman, Rachel Susman, Michael Christie, Mark A. Jenkins, Rish K. Pai, Christophe Rosty, Finlay A. Macrae, Ingrid M. Winship, Daniel D. Buchanan

Journal

Cancers

Volume

15

Issue

20

Year

2023

DOI

10.3390/cancers15204925

URL

https://doi.org/10.3390/cancers15204925

Keywords

Lynch syndrome; DNA mismatch repair gene variant classification; DNA mismatch repair deficient crypts/glands; colorectal cancer; endometrial cancer; variant of uncertain significance; DNA mismatch repair gene somatic mutations

Abstract

Germline pathogenic variants in the DNA mismatch repair (MMR) genes (Lynch syndrome) predispose to colorectal (CRC) and endometrial (EC) cancer. Lynch syndrome specific tumor features were evaluated for their ability to support the ACMG/InSiGHT framework in classifying variants of uncertain clinical significance (VUS) in the MMR genes. Twenty-eight CRC or EC tumors from 25 VUS carriers (6xMLH1, 9xMSH2, 6xMSH6, 4xPMS2), underwent targeted tumor sequencing for the presence of microsatellite instability/MMR-deficiency (MSI-H/dMMR) status and identification of a somatic MMR mutation (second hit). Immunohistochemical testing for the presence of dMMR crypts/glands in normal tissue was also performed. The ACMG/InSiGHT framework reclassified 7/25 (28%) VUS to likely pathogenic (LP), three (12%) to benign/likely benign, and 15 (60%) VUS remained unchanged. For the seven re-classified LP variants comprising nine tumors, tumor sequencing confirmed MSI-H/dMMR (8/9, 88.9%) and a second hit (7/9, 77.8%). Of these LP reclassified variants where normal tissue was available, the presence of a dMMR crypt/gland was found in 2/4 (50%). Furthermore, a dMMR endometrial gland in a carrier of an MSH2 exon 1-6 duplication provides further support for an upgrade of this VUS to LP. Our study confirmed that identifying these Lynch syndrome features can improve MMR variant classification, enabling optimal clinical care.

Abstract PR012: Ultra-sensitive detection of circulating tumour DNA enriches for patients with higher risk disease in clinically localised prostate cancer. Proceedings of the AACR Special Conference: Advances in Prostate Cancer Research; 2023 Mar 15-18; Denver, Colorado, Cancer Research, 2023.

Authors

Bernard J. Pope, Gahee Park, Edmund Lau, Jelena Belic, Radoslaw Lach, Anne George, Patrick McCoy, Anne Nguyen, Corrina Grima, Chol-hee Jung, Emma-Jane Ditter, Hui Zhao, David Wedge, Rosalind A Eeles, Daniel Brewer, Andy G. Lynch, Harveer Dev, Christopher M Hovens, Vincent J. Gnanapragasam, Nitzan Rosenfeld, Niall M. Corcoran, Charles E. Massie

Journal

Cancer Research

Volume

83

Issue

11_Supplement

Year

2023

DOI

10.1158/1538-7445.PRCA2023-PR012

URL

https://doi.org/10.1158/1538-7445.PRCA2023-PR012

Abstract

Purpose
Circulating tumour DNA (ctDNA) analysis has demonstrated utility for diagnostic and prognostic applications in many cancer types, including metastatic prostate cancer. However, localised prostate cancer yields relatively low levels of ctDNA and therefore it has been difficult to detect in this context using conventional methods. We aimed to assess the limits of detection of ctDNA in localised prostate cancer by leveraging thousands of patient-specific mutations per case with the high-sensitivity INVAR method, and to test the hypothesis that ctDNA detection is associated with high-risk disease.
Experimental Procedures
A total of 128 individuals with clinically localised prostate cancer at the time of sample collection were selected from cohorts in Australia (n=48) and the UK (n=80). Additionally, 27 healthy individuals without prostate cancer were included as negative controls. Primary tumour tissue and matched bloods from all cases were whole genome sequenced (WGS) and somatic variants were called using pipelines from the Pan Prostate Cancer Group. Plasma cell-free DNA (cfDNA) samples from cases and controls were profiled using custom targeted sequencing panels, with saturating coverage of patient-specific mutations identified by WGS. We assessed ctDNA detection in cases using the highly sensitive INVAR pipeline, which leverages consensus read sequencing alignments, background error modelling and integration of signals across thousands of patient-specific variants. Biochemical recurrence and metastasis-free survival curves were used to assess the relationship between ctDNA detection and disease progression.
Results
We analysed pre-treatment blood plasma ctDNA in a large cohort of localised prostate cancer patients, using error-suppressed targeted sequencing of over 280k patient-specific mutations. To comprehensively assess ctDNA mutation analysis in this context, we combined signals across the maximum number of genome-wide patient-specific mutations and leveraged an established analysis pipeline (INVAR) that corrects for background error rates and calculates a global integrated mutant allele fraction. ctDNA was detected in 9.3% of localised prostate cancer patients. In cases where ctDNA was detected we found significant associations with biochemical recurrence (p=0.01) and shorter metastasis-free survival (p < 0.0001).
Conclusions
Our study provides clear insights into the required analytical sensitivity and potential utility of ctDNA mutation analysis in localised prostate cancer. We found that mutation-based ctDNA detection rates were low in localised prostate cancer (<10% cases), but in ctDNA positive cases there was a significant association with relapse after surgical intervention alone. This raises the potential for including ctDNA detection as an additional tool for patient stratification in future neo/adjuvant treatment trials aiming to assess the impact of treatment escalation in men at high risk of relapse with current standard of care treatment alone.

Workflow for SNP genotyping using the Hi-Plex method, protocols.io, 2023.

Authors

Anne-Laure Besnard, Daniel J. Park, Bernard J. Pope, Fleur Hammet, Sophie Michon-Coudouel, Marine Biget, Stacy A. Krueger-Hadfield, Stéphane Mauger, Eric J. Petit

Journal

protocols.io

Year

2023

DOI

10.17504/protocols.io.8epv5jnnnl1b/v1

URL

https://doi.org/10.17504/protocols.io.8epv5jnnnl1b/v1

Abstract

Many research questions in ecology and evolution require balancing sampling strategies between their spatial (how many populations? on which geographical, environmental gradients?), temporal (diachronic approaches), and genomic (how many and which loci?) dimensions. High-throughput molecular biology protocols often offer very good genomic coverage, but this is often only achievable at the expense of other sampling dimensions. This has led to the development of targeted genotyping strategies for SNP locus sets, in addition to whole or reduced genome sequencing strategies. We here present an adaptation of a protocol developed by the University of Melbourne for genotyping rare variants in human oncology to non-model species for use in ecology and evolution. Hi-Plex is an amplicon sequencing technique in which all loci are co-amplified in a multiplex reaction before Illumina or Ion-Torrent sequencing (we used Illumina). Intermediate steps include dual indexing of individual samples used for demultiplexing.

A tumor focused approach to resolving the etiology of DNA mismatch repair deficient tumors classified as suspected Lynch syndrome, Journal of Translational Medicine, 2023.

Authors

Romy Walker, Khalid Mahmood, Jihoon E. Joo, Mark Clendenning, Peter Georgeson, Julia Como, Sharelle Joseland, Susan G. Preston, Yoland Antill, Rachel Austin, Alex Boussioutas, Michelle Bowman, Jo Burke, Ainsley Campbell, Simin Daneshvar, Emma Edwards, Margaret Gleeson, Annabel Goodwin, Marion T. Harris, Alex Henderson, Megan Higgins, John L. Hopper, Ryan A. Hutchinson, Emilia Ip, Joanne Isbister, Kais Kasem, Helen Marfan, Di Milnes, Annabelle Ng, Cassandra Nichols, Shona O’Connell, Nicholas Pachter, Bernard J. Pope, Nicola Poplawski, Abiramy Ragunathan, Courtney Smyth, Allan Spigelman, Kirsty Storey, Rachel Susman, Jessica A. Taylor, Linda Warwick, Mathilda Wilding, Rachel Williams, Aung K. Win, Michael D. Walsh, Finlay A. Macrae, Mark A. Jenkins, Christophe Rosty, Ingrid M. Winship, Daniel D. Buchanan, Family Cancer Clinics of Australia

Journal

Journal of Translational Medicine

Volume

21

Issue

282

Year

2023

DOI

10.1186/s12967-023-04143-1

Pubmed ID

37101184

URL

https://doi.org/10.1186/s12967-023-04143-1

Abstract

Routine screening of tumors for DNA mismatch repair (MMR) deficiency (dMMR) in colorectal (CRC), endometrial (EC) and sebaceous skin (SST) tumors leads to a significant proportion of unresolved cases classified as suspected Lynch syndrome (SLS). SLS cases (n = 135) were recruited from Family Cancer Clinics across Australia and New Zealand. Targeted panel sequencing was performed on tumor (n = 137; 80×CRCs, 33×ECs and 24×SSTs) and matched blood-derived DNA to assess for microsatellite instability status, tumor mutation burden, COSMIC tumor mutational signatures and to identify germline and somatic MMR gene variants. MMR immunohistochemistry (IHC) and MLH1 promoter methylation were repeated. In total, 86.9% of the 137 SLS tumors could be resolved into established subtypes. For 22.6% of these resolved SLS cases, primary MLH1 epimutations (2.2%) as well as previously undetected germline MMR pathogenic variants (1.5%), tumor MLH1 methylation (13.1%) or false positive dMMR IHC (5.8%) results were identified. Double somatic MMR gene mutations were the major cause of dMMR identified across each tumor type (73.9% of resolved cases, 64.2% overall, 70% of CRC, 45.5% of ECs and 70.8% of SSTs). The unresolved SLS tumors (13.1%) comprised tumors with only a single somatic (7.3%) or no somatic (5.8%) MMR gene mutations. A tumor-focused testing approach reclassified 86.9% of SLS into Lynch syndrome, sporadic dMMR or MMR-proficient cases. These findings support the incorporation of tumor sequencing and alternate MLH1 methylation assays into clinical diagnostics to reduce the number of SLS patients and provide more appropriate surveillance and screening recommendations.

Somatic mutation landscape in a cohort of meningiomas that have undergone grade progression, BMC Cancer, 2023.

Authors

Sarah A Cain, Bernard Pope, Stefano Mangiola, Theo Mantamadiotis, Katharine J Drummond

Journal

BMC Cancer

Volume

23

Year

2023

DOI

10.1186/s12885-023-10624-9

Pubmed ID

36882706

URL

https://doi.org/10.1186/s12885-023-10624-9

Keywords

Meningioma; Next generation sequencing; Anaplastic; Atypical; Malignant

Abstract

Background
A subset of meningiomas progress in histopathological grade but drivers of progression are poorly understood. We aimed to identify somatic mutations and copy number alterations (CNAs) associated with grade progression in a unique matched tumour dataset.
Methods
Utilising a prospective database, we identified 10 patients with meningiomas that had undergone grade progression and for whom matched pre- and post-progression tissue (n = 50 samples) was available for targeted next-generation sequencing.

Results
Mutations in NF2 were identified in 4/10 patients; of these, 94% were non-skull base tumours. In one patient, three different NF2 mutations were identified in four tumours. NF2 mutated tumours showed large-scale CNAs, with highly recurrent losses in 1p, 10, 22q, and frequent CNAs on chromosomes 2, 3 and 4. There was a correlation between grade and CNAs in two patients. Two patients with tumours without detected NF2 mutations showed a combination of loss and high gain on chromosome 17q. Mutations in SETD2, TP53, TERT promoter and NF2 were not uniform across recurrent tumours; however, they did not correspond with the onset of grade progression.
Conclusion
Meningiomas that progress in grade generally have a mutational profile already detectable in the pre-progressed tumour, suggesting an aggressive phenotype. CNA profiling shows frequent alterations in NF2 mutated tumours compared to non NF2 mutated tumours. The pattern of CNAs may be associated with grade progression in a subset of cases.

A polygenic two-hit hypothesis for prostate cancer, Journal of the National Cancer Institute, 2023.

Authors

Kathleen E Houlahan, Julie Livingstone, Natalie S Fox, Natalie Kurganovs, Helen Zhu, Jocelyn Sietsma Penington, Chol-Hee Jung, Takafumi N Yamaguchi, Lawrence E Heisler, Richard Jovelin, Anthony J Costello, Bernard J Pope, Amar U Kishan, Niall M Corcoran, Robert G Bristow, Sebastian M Waszak, Joachim Weischenfeldt, Housheng H He, Rayjean J Hung, Christopher M Hovens, Paul C Boutros

Journal

Journal of the National Cancer Institute

Year

2023

DOI

10.1093/jnci/djad001

Pubmed ID

36610996

URL

https://doi.org/10.1093/jnci/djad001

Abstract

Prostate cancer is one of the most heritable cancers. Hundreds of germline polymorphisms have been linked to prostate cancer diagnosis and prognosis. Polygenic risk scores can predict genetic risk of a prostate cancer diagnosis. While these scores inform on the probability of developing a tumor, it remains unknown how germline risk influences the tumor molecular evolution. We cultivated a cohort of 1,250 localized European-descent patients with germline and somatic DNA profiling. Men of European descent with higher genetic risk were diagnosed earlier, had less genomic instability, and fewer driver genes mutated. Higher genetic risk was associated with better outcome. These data imply a polygenic “two-hit” model where germline risk reduces the number of somatic alterations required for tumorigenesis. These findings support further clinical studies of PRS as inexpensive and minimally invasive adjuncts to standard risk stratification. Further studies are required to interrogate generalizability to more ancestrally and clinically diverse populations.

Evaluating multiple next-generation sequencing derived tumor features to accurately predict DNA mismatch repair status, The Journal of Molecular Diagnostics, 2022.

Authors

Romy Walker, Peter Georgeson, Khalid Mahmood, Jihoon Joo, Enes Makalic, Mark Clendenning, Julia Como, Susan Preston, Sharelle Joseland, Bernard Pope, Ryan Hutchinson, Kais Kasem, Michael Walsh, Finlay Macrae, Aung Win, John Hopper, Dmitri Mouradov, Peter Gibbs, Oliver Sieber, Dylan O'Sullivan, Darren Brenner, Steven Gallinger, Mark Jenkins, Christophe Rosty, Ingrid Winship, Daniel Buchanan

Journal

The Journal of Molecular Diagnostics

Year

2022

DOI

10.1016/j.jmoldx.2022.10.003

Pubmed ID

36396080

URL

https://doi.org/10.1016/j.jmoldx.2022.10.003

Keywords

Colorectal cancer; DNA mismatch repair deficiency; endometrial cancer; Lynch syndrome; microsatellite instability; MLH1 promoter methylation; sebaceous skin tumor; tumor mutation burden; tumor mutational signatures

Abstract

Identifying tumor DNA mismatch repair deficiency (dMMR) is important for precision medicine. Tumor features, individually and in combination, derived from whole-exome sequenced (WES) colorectal cancers (CRCs) and panel sequenced CRCs, endometrial cancers (ECs) and sebaceous skin tumors (SSTs) were assessed for their accuracy in detecting dMMR. CRCs (n=300) with WES, where MMR status was determined by immunohistochemistry, were assessed for microsatellite instability (MSMuTect, MANTIS, MSIseq, MSISensor), COSMIC tumor mutational signatures (TMS) and somatic mutation counts. A 10-fold cross-validation approach (100 repeats) evaluated the dMMR prediction accuracy for 1) individual features, 2) Lasso statistical model and 3) an additive feature combination approach. Panel sequenced tumors (29 CRCs, 22 ECs, 20 SSTs) were assessed for the top performing dMMR predicting features/models using these three approaches. For WES CRCs, 10 features provided >80% dMMR prediction accuracy, with MSMuTect, MSIseq, and MANTIS achieving ≥99% accuracy. The Lasso model achieved 98.3%. The additive feature approach with ≥3/6 of MSMuTect, MANTIS, MSIseq, MSISensor, INDEL count or TMS ID2+ID7 achieved 99.7% accuracy. For the panel sequenced tumors, the additive feature combination approach of ≥3/6 achieved accuracies of 100%, 95.5% and 100%, for CRCs, ECs, and SSTs, respectively. The microsatellite instability calling tools performed well in WES CRCs, however, an approach combining tumor features may improve dMMR prediction in both WES and panel sequenced data across tissue types.
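The additive feature combination rule named in the abstract above (a tumor is predicted dMMR when at least 3 of 6 features are positive) reduces to a simple vote count. The Python sketch below is illustrative only: the abstract does not give the per-feature cut-offs, so each feature is assumed to have already been reduced to a True/False call, and the function name and input layout are hypothetical.

```python
# The six features listed in the abstract for the additive approach.
FEATURES = (
    "MSMuTect",
    "MANTIS",
    "MSIseq",
    "MSISensor",
    "INDEL count",
    "TMS ID2+ID7",
)


def additive_dmmr_call(feature_calls, min_positive=3):
    """Predict dMMR when at least `min_positive` features are positive.

    `feature_calls` maps a feature name to a boolean call. How each call is
    derived from raw tool output is outside this sketch (assumption).
    Features missing from the mapping are counted as negative.
    """
    positives = sum(1 for name in FEATURES if feature_calls.get(name, False))
    return positives >= min_positive


# Hypothetical tumor with three positive features out of six.
example_tumor = {
    "MSMuTect": True,
    "MANTIS": True,
    "MSIseq": False,
    "MSISensor": False,
    "INDEL count": True,
    "TMS ID2+ID7": False,
}
prediction = additive_dmmr_call(example_tumor)
```

Counting votes rather than relying on one tool means a single discordant caller cannot flip the prediction on its own, which is one plausible reason such a rule might transfer across sequencing platforms and tissue types.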

VIVID: A Web Application for Variant Interpretation and Visualization in Multi-dimensional Analyses, Molecular Biology and Evolution, 2022.

Authors

Swapnil Tichkule, Yoochan Myung, Myo T Naung, Brendan R E Ansell, Andrew J Guy, Namrata Srivastava, Somya Mehra, Simone M Cacciò, Ivo Mueller, Alyssa E Barry, Cock van Oosterhout, Bernard Pope, David B Ascher, Aaron R Jex

Journal

Molecular Biology and Evolution

Volume

39

Issue

9

Year

2022

DOI

10.1093/molbev/msac196

Pubmed ID

36103257

URL

https://doi.org/10.1093/molbev/msac196

Keywords

data visualization; evolution; multi-dimensional analysis; population genetics; protein structure; variant interpretation

Abstract

Large-scale comparative genomic and population genetic studies generate enormous amounts of polymorphism data in the form of DNA variants. Ultimately, the goal of many of these studies is to associate genetic variants to phenotypes or fitness. We introduce VIVID, an interactive, user-friendly web application that integrates a wide range of approaches for encoding genotypic to phenotypic information in any organism or disease, from an individual or population, in three-dimensional (3D) space. It allows mutation mapping and annotation, calculation of interactions and conservation scores, prediction of harmful effects, analysis of diversity and selection, and 3D visualization of genotypic information encoded in Variant Call Format on AlphaFold2 protein models. VIVID enables the rapid assessment of genes of interest in the study of adaptive evolution and the genetic load, and it helps prioritize targets for experimental validation. We demonstrate the utility of VIVID by exploring the evolutionary genetics of the parasitic protist Plasmodium falciparum, revealing geographic variation in the signature of balancing selection in potential targets of functional antibodies.

Perish and publish: Dynamics of biomedical publications by deceased authors, PLOS ONE, 2022.

Authors

Chol-Hee Jung, Paul C. Boutros, Daniel J. Park, Niall M. Corcoran, Bernard J. Pope, Christopher M. Hovens

Journal

PLOS ONE

Year

2022

DOI

10.1371/journal.pone.0273783

URL

https://doi.org/10.1371/journal.pone.0273783

Abstract

The question of whether it is appropriate to attribute authorship to deceased individuals of original studies in the biomedical literature is contentious. Authorship guidelines utilized by journals do not provide a clear consensus framework that is binding on those in the field. To guide and inform the implementation of authorship frameworks it would be useful to understand the extent of the practice in the scientific literature, but studies that have systematically quantified the prevalence of this phenomenon in the biomedical literature have not been performed to date. To address this issue, we quantified the prevalence of publications by deceased authors in the biomedical literature from the period 1990–2020. We screened 2,601,457 peer-reviewed papers from the full text Europe PubMed Central database. We applied natural language processing, stringent filtering and manual curation to identify a final set of 1,439 deceased authors. We then determined these authors published a total of 38,907 papers over their careers with 5,477 published after death. The number of deceased publications has been growing rapidly, a 146-fold increase since the year 2000. This rate of increase was still significant when accounting for the growing total number of publications and pool of authors. We found that more than 50% of deceased author papers were first submitted after the death of the author and that over 60% of these papers failed to acknowledge the deceased author's status. Most deceased authors published fewer than 10 papers after death but a small pool of 30 authors published significantly more. A pool of 266 authors published more than 90% of their total publications after death. Our analysis indicates that the attribution of deceased authorship in the literature is not an occasional occurrence but a burgeoning trend. A consensus framework to address authorship by deceased scientists is warranted.

Identifying colorectal cancer caused by biallelic MUTYH pathogenic variants using tumor mutational signatures, Nature Communications, 2022.

Authors

Peter Georgeson, Tabitha A. Harrison, Bernard J. Pope, Syed H. Zaidi, Conghui Qu, Robert S. Steinfelder, Yi Lin, Jihoon E. Joo, Khalid Mahmood, Mark Clendenning, Romy Walker, Efrat L. Amitay, Sonja I. Berndt, Hermann Brenner, Peter T. Campbell, Yin Cao, Andrew T. Chan, Jenny Chang-Claude, Kimberly F. Doheny, David A. Drew, Jane C. Figueiredo, Amy J. French, Steven Gallinger, Marios Giannakis, Graham G. Giles, Andrea Gsur, Marc J. Gunter, Michael Hoffmeister, Li Hsu, Wen-Yi Huang, Paul Limburg, JoAnn E. Manson, Victor Moreno, Rami Nassir, Jonathan A. Nowak, Mireia Obón-Santacana, Shuji Ogino, Amanda I. Phipps, John D. Potter, Robert E. Schoen, Wei Sun, Amanda E. Toland, Quang M. Trinh, Tomotaka Ugai, Finlay A. Macrae, Christophe Rosty, Thomas J. Hudson, Mark A. Jenkins, Stephen N. Thibodeau, Ingrid M. Winship, Ulrike Peters, Daniel D. Buchanan

Journal

Nature Communications

Volume

13

Year

2022

DOI

10.1038/s41467-022-30916-1

URL

https://doi.org/10.1038/s41467-022-30916-1

Abstract

Carriers of germline biallelic pathogenic variants in the MUTYH gene have a high risk of colorectal cancer. We test 5649 colorectal cancers to evaluate the discriminatory potential of a tumor mutational signature specific to MUTYH for identifying biallelic carriers and classifying variants of uncertain clinical significance (VUS). Using a tumor and matched germline targeted multi-gene panel approach, our classifier identifies all biallelic MUTYH carriers and all known non-carriers in an independent test set of 3019 colorectal cancers (accuracy = 100% (95% confidence interval 99.87–100%)). All monoallelic MUTYH carriers are classified with the non-MUTYH carriers. The classifier provides evidence for a pathogenic classification for two VUS and a benign classification for five VUS. Somatic hotspot mutations KRAS p.G12C and PIK3CA p.Q546K are associated with colorectal cancers from biallelic MUTYH carriers compared with non-carriers (p = 2 × 10^-23 and p = 6 × 10^-11, respectively). Here, we demonstrate the potential application of mutational signatures to tumor sequencing workflows to improve the identification of biallelic MUTYH carriers.
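An accuracy of 100% still carries a confidence interval below 100%, because a finite test set cannot rule out a small error rate. The Python sketch below shows one common way to compute such an interval, the Wilson score interval. The abstract does not state which interval method the authors used, and treating all 3019 test-set tumors as correctly classified is an assumption drawn from the reported 100% accuracy, so this is an illustration of the idea rather than a reproduction of the paper's calculation.

```python
import math
from statistics import NormalDist


def wilson_interval(successes, trials, confidence=0.95):
    """Wilson score interval for a binomial proportion.

    Returns (lower, upper) as fractions between 0 and 1.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = successes / trials
    denom = 1 + z * z / trials
    centre = (p_hat + z * z / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z * z / (4 * trials * trials))
    ) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Assumed reading of the abstract: 3019 of 3019 tumors classified correctly.
lower, upper = wilson_interval(3019, 3019)
```

When every case is classified correctly, the lower bound simplifies to n / (n + z^2), so it approaches 100% only as the test set grows.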

Rare Germline Variants Are Associated with Rapid Biochemical Recurrence After Radical Prostate Cancer Treatment: A Pan Prostate Cancer Group Study, European Urology, 2022.

Authors

Daniel Burns, Ezequiel Anokian, Edward J. Saunders, Robert G. Bristow, Michael Fraser, Juri Reimand, Thorsten Schlomm, Guido Sauter, Benedikt Brors, Jan Korbel, Joachim Weischenfeldt, Sebastian M. Waszak, Niall M. Corcoran, Chol-Hee Jung, Bernard J. Pope, Chris M. Hovens, Geraldine Cancel-Tassin, Olivier Cussenot, Massimo Loda, Chris Sander, Vanessa M. Hayes, Karina Dalsgaard Sorensen, Yong-Jie Lu, Freddie C. Hamdy, Christopher S. Foster, Vincent Gnanapragasam, Adam Butler, Andy G. Lynch, Charlie E. Massie, Dan J. Woodcock, Colin S. Cooper, David C. Wedge, Daniel S. Brewer, Zsofia Kote-Jarai, Rosalind A. Eeles

Journal

European Urology

Year

2022

DOI

10.1016/j.eururo.2022.05.007

Pubmed ID

35659150

URL

https://doi.org/10.1016/j.eururo.2022.05.007

Keywords

Germline variants; Prostate cancer; Biochemical recurrence; Pan Prostate Cancer Group

Abstract

Background
Germline variants explain more than a third of prostate cancer (PrCa) risk, but very few associations have been identified between heritable factors and clinical progression.
Objective
To find rare germline variants that predict time to biochemical recurrence (BCR) after radical treatment in men with PrCa and understand the genetic factors associated with such progression.
Design, setting, and participants
Whole-genome sequencing data from blood DNA were analysed for 850 PrCa patients with radical treatment from the Pan Prostate Cancer Group (PPCG) consortium from the UK, Canada, Germany, Australia, and France. Findings were validated using 383 patients from The Cancer Genome Atlas (TCGA) dataset.
Outcome measurements and statistical analysis
A total of 15,822 rare (MAF <1%) predicted-deleterious coding germline mutations were identified. Optimal multifactor and univariate Cox regression models were built to predict time to BCR after radical treatment, using germline variants grouped by functionally annotated gene sets. Models were tested for robustness using bootstrap resampling.
Results and limitations
Optimal Cox regression multifactor models showed that rare predicted-deleterious germline variants in “Hallmark” gene sets were consistently associated with altered time to BCR. Three gene sets had a statistically significant association with risk-elevated outcome when modelling all samples: PI3K/AKT/mTOR, Inflammatory response, and KRAS signalling (up). PI3K/AKT/mTOR and KRAS signalling (up) were also associated among patients with higher-grade cancer, as were Pancreas-beta cells, TNFA signalling via NFKB, and Hypoxia, the latter of which was validated in the independent TCGA dataset.
Conclusions
We demonstrate for the first time that rare deleterious coding germline variants robustly associate with time to BCR after radical treatment, including cohort-independent validation. Our findings suggest that germline testing at diagnosis could aid clinical decisions by stratifying patients for differential clinical management.

Phase 2 Study of Neoadjuvant FGFR Inhibition and Androgen Deprivation Therapy Prior to Prostatectomy, Clinical Genitourinary Cancer, 2022.

Authors

Elizabeth Liow, Nicholas Howard, Chol-Hee Jung, Bernard Pope, Bethany K. Campbell, Anne Nguyen, Michael Kerger, Jonathan B. Ruddle, Angelyn Anton, Benjamin Thomas, Kevin Chu, Philip Dundee, Justin S. Peters, Anthony J. Costello, Andrew S. Ryan, Christopher M. Hovens, Ben Tran, Niall M. Corcoran

Journal

Clinical Genitourinary Cancer

Year

2022

DOI

10.1016/j.clgc.2022.05.007

URL

https://doi.org/10.1016/j.clgc.2022.05.007

Keywords

High-risk Prostate Cancer; Neoadjuvant therapy; FGFR Inhibition; Androgen Deprivation Therapy; Phase 2 Study

Abstract

Background
Disease recurrence is common following prostatectomy in patients with localised prostate cancer with high-risk features. Although androgen deprivation therapy increases the rates of organ-confined disease and negative surgical margins, there is no significant benefit on disease recurrence. Multiple lines of evidence suggest that FGF/FGFR signalling is important in supporting prostate epithelial cell survival in hostile conditions, including acute androgen deprivation. Given the recent availability of oral FGFR inhibitors, we investigated whether combination therapy could improve tumour response in the neo-adjuvant setting.
Methods
We conducted an open label phase II study of the combination of erdafitinib (3 months) and androgen deprivation therapy (4 months) in men with localised prostate cancer with high-risk features prior to prostatectomy using a Simon's two stage design. The co-primary endpoints were safety and tolerability and pathological response in the prostatectomy specimen. The effect of treatment on residual tumours was explored by global transcriptional profiling with RNA-sequencing.
Results
Nine patients were enrolled in the first stage of the trial. The treatment combination was poorly tolerated. Erdafitinib treatment was discontinued early in 6 patients, three of whom also required dose interruptions/reductions. Androgen deprivation therapy for 4 months was completed in all patients. The most common adverse events were hyperphosphataemia, taste disturbance, dry mouth and nail changes. No patients achieved a complete pathological response, although patients who tolerated erdafitinib for longer had smaller residual tumours, associated with reduced transcriptional signatures of epithelial cell proliferation.
Conclusions
Although there was a possible enhanced anti-tumour effect of androgen deprivation therapy in combination with erdafitinib in treatment naïve prostate cancer, the poor tolerability in this patient population prohibits the use of this combination in this setting.
Clinical Practice Points
Disease recurrence is common following prostatectomy in patients with localised prostate cancer with high-risk features. Although androgen deprivation therapy increases the rates of organ-confined disease and negative surgical margins, there is no significant benefit on disease recurrence. Multiple lines of evidence suggest that FGF/FGFR signalling is important in supporting prostate epithelial cell survival in hostile conditions, including acute androgen deprivation. We conducted an open label phase II study of the combination of erdafitinib (3 months) and androgen deprivation therapy (4 months) in men with localised prostate cancer with high-risk features prior to prostatectomy using a Simon's two stage design. The co-primary endpoints were safety and tolerability and pathological response in the prostatectomy specimen. The treatment combination was poorly tolerated. The most common adverse events were hyperphosphataemia, taste disturbance, dry mouth and nail changes. No patients achieved a complete pathological response, although patients who tolerated erdafitinib for longer had smaller residual tumours, associated with reduced transcriptional signatures of epithelial cell proliferation. Although there was a possible enhanced anti-tumour effect of androgen deprivation therapy in combination with erdafitinib in treatment naïve prostate cancer, the poor tolerability in this patient population prohibits the use of this combination in this setting.

Population-based estimates of breast cancer risk for carriers of pathogenic variants identified by gene-panel testing, npj Breast Cancer, 2021.

Authors

Melissa C. Southey, James G. Dowty, Moeen Riaz, Jason A. Steen, Anne-Laure Renault, Katherine Tucker, Judy Kirk, Paul James, Ingrid Winship, Nicholas Pachter, Nicola Poplawski, Scott Grist, Daniel J. Park, Bernard J. Pope, Khalid Mahmood, Fleur Hammet, Maryam Mahmoodi, Helen Tsimiklis, Derrick Theys, Amanda Rewse, Amanda Willis, April Morrow, Catherine Speechly, Rebecca Harris, Robert Sebra, Eric Schadt, Paul Lacaze, John J. McNeil, Graham G. Giles, Roger L. Milne, John L. Hopper, Tú Nguyen-Dumont

Journal

npj Breast Cancer

Year

2021

DOI

10.1038/s41523-021-00360-3

Pubmed ID

34887416

URL

https://doi.org/10.1038/s41523-021-00360-3

Abstract

Population-based estimates of breast cancer risk for carriers of pathogenic variants identified by gene-panel testing are urgently required. Most prior research has been based on women selected for high-risk features and more data is needed to make inference about breast cancer risk for women unselected for family history, an important consideration of population screening. We tested 1464 women diagnosed with breast cancer and 862 age-matched controls participating in the Australian Breast Cancer Family Study (ABCFS), and 6549 healthy, older Australian women enrolled in the ASPirin in Reducing Events in the Elderly (ASPREE) study for rare germline variants using a 24-gene-panel. Odds ratios (ORs) were estimated using unconditional logistic regression adjusted for age and other potential confounders. We identified pathogenic variants in 11.1% of the ABCFS cases, 3.7% of the ABCFS controls and 2.2% of the ASPREE (control) participants. The estimated breast cancer OR [95% confidence interval] was 5.3 [2.1–16.2] for BRCA1, 4.0 [1.9–9.1] for BRCA2, 3.4 [1.4–8.4] for ATM and 4.3 [1.0–17.0] for PALB2. Our findings provide a population-based perspective to gene-panel testing for breast cancer predisposition and opportunities to improve predictors for identifying women who carry pathogenic variants in breast cancer predisposition genes.

SNPPar: identifying convergent evolution and other homoplasies from microbial whole-genome alignments, Microbial Genomics, 2021.

Authors

David J. Edwards, Sebastián Duchene, Bernard Pope, Kathryn E. Holt

Journal

Microbial Genomics

Volume

7

Issue

12

Year

2021

DOI

10.1099/mgen.0.000694

Pubmed ID

34874243

URL

https://doi.org/10.1099/mgen.0.000694

Keywords

bacteria; evolution; homoplasy; phylogeny

Abstract

Homoplasic SNPs are considered important signatures of strong (positive) selective pressure, and hence of adaptive evolution for clinically relevant traits such as antibiotic resistance and virulence. Here we present a new tool, SNPPar, for efficient detection and analysis of homoplasic SNPs from large whole genome sequencing datasets (>1000 isolates and/or >100 000 SNPs). SNPPar takes as input an SNP alignment, tree and annotated reference genome, and uses a combination of simple monophyly tests and ancestral state reconstruction (ASR, via TreeTime) to assign mutation events to branches and identify homoplasies. Mutations are annotated at the level of codon and gene, to facilitate analysis of convergent evolution. Testing on simulated data (120 Mycobacterium tuberculosis alignments representing local and global samples) showed SNPPar can detect homoplasic SNPs with very high specificity (zero false-positives in all tests) and high sensitivity (zero false-negatives in 89 % of tests). SNPPar analysis of three empirically sampled datasets (Elizabethkingia anophelis, Burkholderia dolosa and M. tuberculosis) produced results that were in concordance with previous studies, in terms of both individual homoplasies and evidence of convergence at the codon and gene levels. SNPPar analysis of a simulated alignment of ~64 000 genome-wide SNPs from 2000 M. tuberculosis genomes took ~23 min and ~2.6 GB of RAM to generate complete annotated results on a laptop. This analysis required ASR be conducted for only 1.25 % of SNPs, and the ASR step took ~23 s and 0.4 GB of RAM. SNPPar automates the detection and annotation of homoplasic SNPs efficiently and accurately from large SNP alignments. As demonstrated by the examples included here, this information can be readily used to explore the role of homoplasy in parallel and/or convergent evolution at the level of nucleotide, codon and/or gene.

Rare germline variants in the AXIN2 gene in families with colonic polyposis and colorectal cancer, Familial Cancer, 2021.

Authors

James M Chan, Mark Clendenning, Sharelle Joseland, Peter Georgeson, Khalid Mahmood, Romy Walker, Julia Como, Jihoon E Joo, Susan Preston, Ryan A Hutchinson, Bernard J Pope, Andrew Metz, Catherine Beard, Rebecca Purvis, Julie Arnold, Varnika Vijay, Galina Konycheva, Nathan Atkinson, Susan Parry, Mark A Jenkins, Finlay A Macrae, Christophe Rosty, Ingrid M Winship, Daniel D Buchanan

Journal

Familial Cancer

Year

2021

DOI

10.1007/s10689-021-00283-9

Pubmed ID

34817745

URL

https://doi.org/10.1007/s10689-021-00283-9

Keywords

AXIN2; Colonic polyposis; Colorectal cancer; Ectodermal dysplasia; Oligodontia

Abstract

Germline loss-of-function variants in AXIN2 are associated with oligodontia and ectodermal dysplasia. The association between colorectal cancer (CRC) and colonic polyposis is less clear despite this gene now being included in multi-gene panels for CRC. Study participants were people with genetically unexplained colonic polyposis recruited to the Genetics of Colonic Polyposis Study who had a rare germline AXIN2 gene variant identified from either clinical multi-gene panel testing (n=2) or from whole genome/exome sequencing (n=2). Variant segregation in relatives and characterisation of tumour tissue were performed where possible. Four different germline pathogenic variants in AXIN2 were identified in four families. Five of the seven carriers of the c.1049delC, p.Pro350Leufs*13 variant, two of the six carriers of the c.1994dupG, p.Asn666Glnfs*41 variant, all three carriers of c.1972delA, p.Ser658Alafs*31 variant and the single proband carrier of the c.2405G>C, p.Arg802Thr variant, which creates an alternate splice form resulting in a frameshift mutation (p.Glu763Ilefs*42), were affected by CRC and/or polyposis. Carriers had a mean age at diagnosis of CRC/polyposis of 52.5 ± 9.2 years. Colonic polyps were typically pan colonic with counts ranging from 5 to >100 (median 12.5) comprising predominantly adenomatous polyps but also serrated polyps. Two CRCs from carriers displayed evidence of a second hit via loss of heterozygosity. Oligodontia was observed in carriers from two families. Germline AXIN2 pathogenic variants from four families were associated with CRC and/or polyposis in multiple family members. These findings support the inclusion of AXIN2 in CRC and polyposis multigene panels for clinical testing.

Long-read assembly and comparative evidence-based reanalysis of Cryptosporidium genome sequences reveals expanded transporter repertoire and duplication of entire chromosome ends including subtelomeric regions, Genome Research, 2021.

Authors

Rodrigo P Baptista, Yiran Li, Adam Sateriale, Karen L Brooks, Alan Tracey, Mandy J Sanders, Brendan R E Ansell, Aaron R Jex, Garrett W Cooper, Ethan D Smith, Rui Xiao, Jennifer E Dumaine, Peter Georgeson, Bernard Pope, Matthew Berriman, Boris Striepen, James A Cotton, Jessica C Kissinger

Journal

Genome Research

Year

2021

DOI

10.1101/gr.275325.121

Pubmed ID

34764149

URL

https://doi.org/10.1101/gr.275325.121

Abstract

Cryptosporidiosis is a leading cause of waterborne diarrheal disease globally and an important contributor to mortality in infants and the immunosuppressed. Despite its importance, the Cryptosporidium community has only had access to a good, but incomplete, Cryptosporidium parvum IOWA reference genome sequence. Incomplete reference sequences hamper annotation, experimental design and interpretation. We have generated a new C. parvum IOWA genome assembly supported by PacBio and Oxford Nanopore long-read technologies and a new comparative and consistent genome annotation for three closely related species: C. parvum, Cryptosporidium hominis and Cryptosporidium tyzzeri. We made 1,926 C. parvum annotation updates based on experimental evidence. They include new transporters, ncRNAs, introns and altered gene structures. The new assembly and annotation revealed a complete Dnmt2 methylase ortholog. Comparative annotation between C. parvum, C. hominis and C. tyzzeri revealed that most "missing" orthologs are found, suggesting that the biological differences between the species must result from gene copy number variation, differences in gene regulation and single nucleotide variants (SNVs). Using the new assembly and annotation as reference, 190 genes are identified as evolving under positive selection, including many not detected previously. The new C. parvum IOWA reference genome assembly is larger, gap free and lacks ambiguous bases. This chromosomal assembly recovers all 16 chromosome ends, 13 of which are contiguously assembled. The three remaining chromosome ends are provisionally placed. These ends represent duplication of entire chromosome ends including subtelomeric regions, revealing a new level of genome plasticity that will both inform and impact future research.

MSH2-deficient prostate tumours have a distinct immune response and clinical outcome compared to MSH2-deficient colorectal or endometrial cancer, Prostate Cancer and Prostatic Diseases, 2021.

Authors

Patrick McCoy, Stefano Mangiola, Geoff Macintyre, Ryan Hutchinson, Ben Tran, Bernard Pope, Peter Georgeson, Matthew K. H. Hong, Natalie Kurganovs, Sebastian Lunke, Michael J. Clarkson, Marek Cmero, Michael Kerger, Ryan Stuchbery, Ken Chow, Izhak Haviv, Andrew Ryan, Anthony J. Costello, Niall M. Corcoran, Christopher M. Hovens

Journal

Prostate Cancer and Prostatic Diseases

Year

2021

DOI

10.1038/s41391-021-00379-4

URL

https://doi.org/10.1038/s41391-021-00379-4

Abstract

Background
Recent publications have shown patients with defects in the DNA mismatch repair (MMR) pathway driven by either MSH2 or MSH6 loss experience a significant increase in the incidence of prostate cancer. Moreover, this increased incidence of prostate cancer is accompanied by rapid disease progression and poor clinical outcomes.
Methods and results
We show that androgen-receptor activation, a key driver of prostate carcinogenesis, can disrupt the MSH2 gene in prostate cancer. We screened tumours from two cohorts (recurrent/non-recurrent) of prostate cancer patients to confirm the loss of MSH2 protein expression and identified decreased MSH2 expression in recurrent cases. Stratifying the independent TCGA prostate cancer cohort for MSH2/6 expression revealed that patients with lower levels of MSH2/6 had significantly worse outcomes, in contrast to endometrial and colorectal cancer patients with lower MSH2/6 levels. MMRd endometrial and colorectal tumours showed the expected increase in mutational burden, microsatellite instability and enhanced immune cell mobilisation, but this was not evident in prostate tumours.
Conclusions
We have shown that loss or reduced levels of MSH2/MSH6 protein in prostate cancer is associated with poor outcome. However, our data indicate that this is not associated with a statistically significant increase in mutational burden, microsatellite instability or immune cell mobilisation in a cohort of primary prostate cancers.

Evaluating the utility of tumour mutational signatures for identifying hereditary colorectal cancer and polyposis syndrome carriers, Gut, 2021.

Authors

Peter Georgeson, Bernard J Pope, Christophe Rosty, Mark Clendenning, Khalid Mahmood, Jihoon E Joo, Romy Walker, Ryan A Hutchinson, Susan Preston, Julia Como, Sharelle Joseland, Aung Ko Win, Finlay A Macrae, John L Hopper, Dmitri Mouradov, Peter Gibbs, Oliver M Sieber, Dylan E O'Sullivan, Darren R Brenner, Steve Gallinger, Mark A Jenkins, Ingrid M Winship, Daniel D Buchanan

Journal

Gut

Year

2021

DOI

10.1136/gutjnl-2019-320462

Pubmed ID

33414168

URL

https://doi.org/10.1136/gutjnl-2019-320462

Keywords

colorectal cancer; colorectal cancer screening; molecular pathology; mutations; tumour markers

Abstract

Objective
Germline pathogenic variants (PVs) in the DNA mismatch repair (MMR) genes and in the base excision repair gene MUTYH underlie hereditary colorectal cancer (CRC) and polyposis syndromes. We evaluated the robustness and discriminatory potential of tumour mutational signatures in CRCs for identifying germline PV carriers.
Design
Whole-exome sequencing of formalin-fixed paraffin-embedded (FFPE) CRC tissue was performed on 33 MMR germline PV carriers, 12 biallelic MUTYH germline PV carriers, 25 sporadic MLH1 methylated MMR-deficient CRCs (MMRd controls) and 160 sporadic MMR-proficient CRCs (MMRp controls) and included 498 TCGA CRC tumours. COSMIC V3 single base substitution (SBS) and indel (ID) mutational signatures were assessed for their ability to differentiate CRCs that developed in carriers from non-carriers.
Results
The combination of mutational signatures SBS18 and SBS36 contributing >30% of a CRC’s signature profile was able to discriminate biallelic MUTYH carriers from all other non-carrier control CRCs with 100% accuracy (area under the curve (AUC) 1.0). SBS18 and SBS36 were associated with specific MUTYH variants p.Gly396Asp (p=0.025) and p.Tyr179Cys (p=5×10⁻⁵), respectively. The combination of ID2 and ID7 could discriminate the 33 MMR PV carrier CRCs from the MMRp control CRCs (AUC 0.99); however, SBS and ID signatures, alone or in combination, could not provide complete discrimination (AUC 0.79) between CRCs from MMR PV carriers and sporadic MMRd controls.
Conclusion
Assessment of SBS and ID signatures can discriminate CRCs from biallelic MUTYH carriers and MMR PV carriers from non-carriers with high accuracy, demonstrating utility as a potential diagnostic and variant classification tool.

Germline and Tumor Sequencing as a Diagnostic Tool to Resolve Suspected Lynch Syndrome, The Journal of Molecular Diagnostics, 2020.

Authors

Bernard J. Pope, Mark Clendenning, Christophe Rosty, Khalid Mahmood, Peter Georgeson, Jihoon E. Joo, Romy Walker, Ryan A. Hutchinson, Harindra Jayasekara, Sharelle Joseland, Julia Como, Susan Preston, Amanda B. Spurdle, Finlay A. Macrae, Aung K. Win, John L. Hopper, Mark A. Jenkins, Ingrid M. Winship, Daniel D. Buchanan

Journal

The Journal of Molecular Diagnostics

Year

2020

DOI

10.1016/j.jmoldx.2020.12.003

Pubmed ID

33383211

URL

https://doi.org/10.1016/j.jmoldx.2020.12.003

Keywords

Lynch syndrome; Suspected Lynch syndrome; Whole Genome Sequencing; Mismatch Repair Deficiency; double somatic mutations; colorectal cancer

Abstract

People who develop mismatch repair (MMR) deficient cancer in the absence of a germline MMR gene pathogenic variant or somatic hypermethylation of the MLH1 gene promoter are classified as having suspected Lynch syndrome (SLS). Germline whole genome sequencing (WGS) and targeted and genome-wide tumor sequencing were applied to identify the underlying cause of tumor MMR-deficiency in SLS. Germline WGS was performed on 14 cancer-affected people with SLS, including two sets of first-degree relatives. Germline pathogenic variants, including complex structural rearrangements and non-coding variants, were assessed for the MMR genes. Tumor tissue was sequenced for somatic MMR gene mutations by targeted, whole exome sequencing (WES) or WGS. Germline WGS identified pathogenic MMR variants in 3 of the 14 (21.4%) SLS cases including a 9.5Mb inversion disrupting MSH2 in a mother and daughter. Excluding these 3 MMR carriers, tumor sequencing identified at least two somatic MMR gene mutations in 8/11 (72.7%) tumors tested. In a second mother-daughter pair, a somatic cause of their tumor MMR-deficiency was supported by the presence of double somatic MSH2 mutations in their respective tumors. More than 70% of SLS cases were resolved as having double somatic MMR mutations in the absence of germline pathogenic variants in the MMR or other DNA repair-related genes as determined by WGS and were, therefore, confidently assigned a non-inherited cause for their tumor MMR-deficiency.

Monoallelic NTHL1 Loss of Function Variants and Risk of Polyposis and Colorectal Cancer, Gastroenterology, 2020.

Authors

Fadwa A Elsayed, Judith E Grolleman, Abiramy Ragunathan, Arnoud Boot, Marija Staninova Stojovska, Khalid Mahmood, Mark Clendenning, Noel de Miranda, Dagmara Dymerska, Demi van Egmond, Steven Gallinger, Peter Georgeson, Nicoline Hoogerbrugge, John L. Hopper, Erik A.M. Jansen, Mark A. Jenkins, Jihoon E. Joo, Roland P. Kuiper, Marjolijn J.L. Ligtenberg, Jan Lubinski, Finlay A. Macrae, Hans Morreau, Polly Newcomb, Maartje Nielsen, Claire Palles, Daniel J. Park, Bernard J. Pope, Christophe Rosty, Clara Ruiz Ponte, Hans K. Schackert, Rolf H. Sijmons, Ian P. Tomlinson, Carli M. J. Tops, Lilian Vreede, Romy Walker, Aung K. Win, Colon Cancer Family Registry Cohort Investigators, Aleksandar J. Dimovski, Ingrid M. Winship, Daniel D Buchanan, Tom van Wezel, Richarda M de Voer

Journal

Gastroenterology

Year

2020

DOI

10.1053/j.gastro.2020.08.042

Pubmed ID

32860789

URL

https://doi.org/10.1053/j.gastro.2020.08.042

Detection of ctDNA in plasma of patients with clinically localised prostate cancer is associated with rapid disease progression, Genome Medicine, 2020.

Authors

Edmund Lau, Patrick McCoy, Fairleigh Reeves, Ken Chow, Michael Clarkson, Edmond M. Kwan, Kate Packwood, Helen Northen, Miao He, Zoya Kingsbury, Stefano Mangiola, Michael Kerger, Marc A. Furrer, Helen Crowe, Anthony J. Costello, David J. McBride, Mark T. Ross, Bernard Pope, Christopher M. Hovens, Niall M. Corcoran

Journal

Genome Medicine

Volume

12

Issue

1

Year

2020

DOI

10.1186/s13073-020-00770-1

Pubmed ID

32807235

URL

https://doi.org/10.1186/s13073-020-00770-1

Abstract

Background
DNA originating from degenerate tumour cells can be detected in the circulation in many tumour types, where it can be used as a marker of disease burden as well as to monitor treatment response. Although circulating tumour DNA (ctDNA) measurement has prognostic/predictive value in metastatic prostate cancer, its utility in localised disease is unknown.
Methods
We performed whole-genome sequencing of tumour-normal pairs in eight patients with clinically localised disease undergoing prostatectomy, identifying high confidence genomic aberrations. A bespoke DNA capture and amplification panel against the highest prevalence, highest confidence aberrations for each individual was designed and used to interrogate ctDNA isolated from plasma prospectively obtained pre- and post- (24 h and 6 weeks) surgery. In a separate cohort (n = 189), we identified the presence of ctDNA TP53 mutations in preoperative plasma in a retrospective cohort and determined its association with biochemical- and metastasis-free survival.
Results
Tumour variants in ctDNA were positively identified pre-treatment in two of eight patients, which in both cases remained detectable postoperatively. Patients with tumour variants in ctDNA had extremely rapid disease recurrence and progression compared to those where variants could not be detected. In terms of aberrations targeted, single nucleotide and structural variants outperformed indels and copy number aberrations. Detection of ctDNA TP53 mutations was associated with a significantly shorter metastasis-free survival (6.2 vs. 9.5 years; HR 2.4; 95% CI 1.2–4.8; p = 0.014).
Conclusions
CtDNA is uncommonly detected in localised prostate cancer, but its presence portends more rapidly progressive disease.

Genetic testing in Poland and Ukraine: should comprehensive germline testing of BRCA1 and BRCA2 be recommended for women with breast and ovarian cancer?, Genetics Research, 2020.

Authors

Tu Nguyen-Dumont, Pawel Karpinski, Maria M. Sasiadek, Hayane Akopyan, Jason A. Steen, Derrick Theys, Fleur Hammet, Helen Tsimiklis, Daniel J. Park, Bernard J. Pope, Ryszard Slezak, Agnieszka Stembalska, Karolina Pesz, Nataliya Kitsera, Aleksandra Siekierzynska, Melissa C. Southey, Aleksander Myszka

Journal

Genetics Research

Volume

102

Year

2020

DOI

10.1017/S0016672320000075

URL

https://doi.org/10.1017/S0016672320000075

Abstract

Purpose
To characterize the spectrum of BRCA1 and BRCA2 pathogenic germline variants in women from south-west Poland and west Ukraine affected with breast or ovarian cancer. Testing in women at high risk of breast and ovarian cancer in these regions is currently mainly limited to founder mutations.
Methods
Unrelated women affected with breast and/or ovarian cancer from Poland (n = 337) and Ukraine (n = 123) were screened by targeted sequencing. Excluded from targeted sequencing were 34 Polish women who had previously been identified as carrying a founder mutation in BRCA1. No prior testing had been conducted among the Ukrainian women. Thus, this study screened BRCA1 and BRCA2 in the germline DNA of 426 women in total.
Results
We identified 31 and 18 women as carriers of pathogenic/likely pathogenic (P/LP) genetic variants in BRCA1 and BRCA2, respectively. We observed five BRCA1 and eight BRCA2 P/LP variants (13/337, 3.9%) in the Polish women. Combined with the 34/337 (10.1%) founder variants identified prior to this study, the overall P/LP variant frequency in the Polish women was thus 14% (47/337). Among the Ukrainian women, 16/123 (13%) women were identified as carrying a founder mutation and 20/123 (16.3%) were found to carry non-founder P/LP variants (10 in BRCA1 and 10 in BRCA2).
Conclusions
These results indicate that genetic testing in women at high risk of breast and ovarian cancer in Poland and Ukraine should not be limited to founder mutations. Extended testing will enhance risk stratification and management for these women and their families.

HiTIME: An efficient model-selection approach for the detection of unknown drug metabolites in LC-MS data, SoftwareX, 2020.

Authors

Michael G. Leeming, Andrew P. Isaac, Luke Zappia, Richard A. J. O'Hair, William A. Donald, Bernard J. Pope

Journal

SoftwareX

Volume

12

Year

2020

DOI

10.1016/j.softx.2020.100559

URL

https://doi.org/10.1016/j.softx.2020.100559

Abstract

The identification of metabolites plays an important role in understanding drug efficacy and safety; however, these compounds are often difficult to identify in complex mixtures. One approach to identify drug metabolites involves utilising differentially isotopically labelled drug compounds to create unique isotopic signals that can be detected by liquid chromatography-mass spectrometry (LC-MS). User-friendly, efficient, computational tools that allow selective detection of these signals are lacking. We have developed an efficient open-source software tool called HiTIME (High-Resolution Twin-Ion Metabolite Extraction) which filters twin-ion signals in LC-MS data. The intensity of each data point in the input is replaced by a Z-score describing how well the point matches an idealised twin-ion signal versus alternative ion signatures. Here we provide a detailed description of the algorithm and demonstrate its performance on simulated and experimental data.

Rare germline genetic variants and risk of aggressive prostate cancer, International Journal of Cancer, 2020.

Authors

Nguyen-Dumont T, MacInnis R J, Steen J A, Theys D, Tsimiklis H, Hammet F, Mahmoodi M, Pope B J, Park D J, Mahmood K, Severi G, Bolton D, Milne R L, Giles G G, Southey M C

Journal

International Journal of Cancer

Year

2020

DOI

10.1002/ijc.33024

Pubmed ID

32338768

URL

https://doi.org/10.1002/ijc.33024

Keywords

aggressive prostate cancer; gene panel testing; germline genetic variants

Abstract

Few genetic risk factors have been demonstrated to be specifically associated with aggressive prostate cancer (PrCa). Here, we report a case-case study of PrCa comparing the prevalence of germline pathogenic/likely pathogenic (P/LP) genetic variants in 787 men with aggressive disease and 769 with non-aggressive disease. Overall, we observed P/LP variants in 11.4% of men with aggressive PrCa and 9.8% of men with non-aggressive PrCa (two-tailed Fisher's exact tests, P = 0.28). The proportion of BRCA2 and ATM P/LP variant carriers in men with aggressive PrCa exceeded that observed in men with non-aggressive PrCa; 18/787 carriers (2.3%) and 4/769 carriers (0.5%), P = 0.004, and 14/787 carriers (1.8%) and 5/769 carriers (0.7%), P = 0.06, respectively. Our findings contribute to the extensive international effort to interpret the genetic variation identified in genes included on gene-panel tests, for which there is currently an insufficient evidence-base for clinical translation in the context of PrCa risk.

Mismatch repair gene pathogenic germline variants in a population-based cohort of breast cancer, Familial Cancer, 2020.

Authors

Tu Nguyen‑Dumont, Jason A. Steen, Ingrid Winship, Daniel J. Park, Bernard J. Pope, Fleur Hammet, Maryam Mahmoodi, Helen Tsimiklis, Derrick Theys, Mark Clendenning, Graham G. Giles, John L. Hopper, Melissa C. Southey

Journal

Familial Cancer

Year

2020

DOI

10.1007/s10689-020-00164-7

Pubmed ID

32060697

URL

https://doi.org/10.1007/s10689-020-00164-7

Abstract

The advent of gene panel testing is challenging the previous practice of using clinically defined cancer family syndromes to inform single-gene genetic screening. Individual and family cancer histories that would have previously indicated testing of a single gene or a small number of related genes are now, increasingly, leading to screening across gene panels that contain larger numbers of genes. We have applied a gene panel test that included four DNA mismatch repair (MMR) genes (MLH1, MSH2, MSH6 and PMS2) to an Australian population-based case–control-family study of breast cancer. Altogether, eight pathogenic variants in MMR genes were identified: six in 1421 case-families (0.4%, 4 MSH6 and 2 PMS2) and two in 833 control-families (0.2%, one each of MLH1 and MSH2). This testing highlights the current and future challenges for clinical genetics in the context of anticipated gene panel-based population-based screening that includes the MMR genes. This testing is likely to provide additional opportunities for cancer prevention via cascade testing for Lynch syndrome and precision medicine for breast cancer treatment.

Bionitio: demonstrating and facilitating best practices for bioinformatics command-line software, GigaScience, 2019.

Authors

Peter Georgeson, Anna Syme, Clare Sloggett, Jessica Chung, Harriet Dashnow, Michael Milton, Andrew Lonsdale, David Powell, Torsten Seemann, Bernard Pope

Journal

GigaScience

Volume

8

Issue

9

Year

2019

DOI

10.1093/gigascience/giz109

URL

https://doi.org/10.1093/gigascience/giz109

Abstract

Background. Bioinformatics software tools are often created ad hoc, frequently by people without extensive training in software development. In particular, for beginners, the barrier to entry in bioinformatics software development is high, especially if they want to adopt good programming practices. Even experienced developers do not always follow best practices. This results in the proliferation of poorer-quality bioinformatics software, leading to limited scalability and inefficient use of resources; lack of reproducibility, usability, adaptability, and interoperability; and erroneous or inaccurate results.
Findings. We have developed Bionitio, a tool that automates the process of starting new bioinformatics software projects following recommended best practices. With a single command, the user can create a new well-structured project in 1 of 12 programming languages. The resulting software is functional, carrying out a prototypical bioinformatics task, and thus serves as both a working example and a template for building new tools. Key features include command-line argument parsing, error handling, progress logging, defined exit status values, a test suite, a version number, standardized building and packaging, user documentation, code documentation, a standard open source software license, software revision control, and containerization.
Conclusions. Bionitio serves as a learning aid for beginner-to-intermediate bioinformatics programmers and provides an excellent starting point for new projects. This helps developers adopt good programming practices from the beginning of a project and encourages high-quality tools to be developed more rapidly. This also benefits users because tools are more easily installed and consistent in their usage. Bionitio is released as open source software under the MIT License and is available at https://github.com/bionitio-team/bionitio

Tumor mutational signatures in sebaceous skin lesions from individuals with Lynch syndrome, Molecular Genetics & Genomic Medicine, 2019.

Authors

Georgeson P, Walsh MD, Clendenning M, Daneshvar S, Pope BJ, Mahmood K, Joo JE, Jayasekara H, Jenkins M, Winship IM, Buchanan DD

Journal

Molecular Genetics & Genomic Medicine

Volume

7

Issue

7

Year

2019

DOI

10.1002/mgg3.781

Pubmed ID

31162827

URL

https://doi.org/10.1002/mgg3.781

Keywords

Lynch syndrome; Muir-Torre syndrome; mismatch repair deficiency; mismatch repair immunohistochemistry; sebaceoma; sebaceous adenoma; sebaceous carcinoma

Abstract

BACKGROUND: Muir-Torre syndrome is defined by the development of sebaceous skin lesions in individuals who carry a germline mismatch repair (MMR) gene mutation. Loss of expression of MMR proteins is frequently observed in sebaceous skin lesions, but MMR-deficiency alone is not diagnostic for carrying a germline MMR gene mutation.
METHODS: Whole exome sequencing was performed on three MMR-deficient sebaceous lesions from individuals with MSH2 gene mutations (Lynch syndrome) and three MMR-proficient sebaceous lesions from individuals without Lynch syndrome with the aim of characterizing the tumor mutational signatures, somatic mutation burden, and microsatellite instability status. Thirty predefined somatic mutational signatures were calculated for each lesion.
RESULTS: Signature 1 was ubiquitous across the six lesions tested. Signatures 6 and 15, associated with defective DNA MMR, were significantly more prevalent in the MMR-deficient lesions from the MSH2 carriers compared with the MMR-proficient non-Lynch sebaceous lesions (mean ± SD=41.0 ± 8.2% vs. 2.3 ± 4.0%, p = 0.0018). Tumor mutation burden was, on average, significantly higher in the MMR-deficient lesions compared with the MMR-proficient lesions (23.3 ± 11.4 vs. 1.8 ± 0.8 mutations/Mb, p = 0.03). All four sebaceous lesions observed in sun exposed areas of the body demonstrated signature 7 related to ultraviolet light exposure.
CONCLUSION: Tumor mutational signatures 6 and 15 and somatic mutation burden were effective in differentiating Lynch-related from non-Lynch sebaceous lesions.

Hi-Plex2: a simple and robust approach to targeted sequencing-based genetic screening, BioTechniques, 2019.

Authors

Fleur Hammet, Khalid Mahmood, Thomas R Green, Tu Nguyen-Dumont, Melissa C Southey, Daniel D Buchanan, Andrew Lonie, Katherine L Nathanson, Fergus J Couch, Bernard J Pope, Daniel J Park

Journal

BioTechniques

Year

2019

DOI

10.2144/btn-2019-0026

Pubmed ID

31267764

URL

http://dx.doi.org/10.2144/btn-2019-0026

Keywords

genetic screening; Hi-Plex; multiplex PCR; targeted DNA sequencing; variant detection

Abstract

We have previously reported Hi-Plex, a multiplex PCR methodology for building targeted DNA sequencing libraries that offers a low-cost protocol compatible with high-throughput processing. Here, we detail an improved protocol, Hi-Plex2, that more effectively enables the robust construction of small-to-medium panel-size libraries while maintaining low cost, simplicity and accuracy benefits of the Hi-Plex platform. Hi-Plex2 was applied to three panels, comprising 291, 740 and 1193 amplicons, targeting genes associated with risk for breast and/or colon cancer. We show substantial reduction of off-target amplification to enable library construction for small-to-medium-sized design panels not possible using the previous Hi-Plex chemistry.

Annotation of the Giardia proteome through structure-based homology and machine learning, Gigascience, 2019.

Authors

Brendan R E Ansell, Bernard J Pope, Peter Georgeson, Samantha J Emery-Corbin, Aaron R Jex

Journal

Gigascience

Volume

8

Issue

1

Year

2019

DOI

10.1093/gigascience/giy150

Pubmed ID

30520990

URL

http://dx.doi.org/10.1093/gigascience/giy150

Abstract

Background
Large-scale computational prediction of protein structures represents a cost-effective alternative to empirical structure determination with particular promise for non-model organisms and neglected pathogens. Conventional sequence-based tools are insufficient to annotate the genomes of such divergent biological systems. Conversely, protein structure tolerates substantial variation in primary amino acid sequence, and is thus a robust indicator of biochemical function. Structural proteomics is poised to become a standard part of pathogen genomics research; however, informatic methods are now required to assign confidence in large volumes of predicted structures.
Aims
To predict the proteome of a neglected human pathogen, Giardia duodenalis, and stratify predicted structures into high- and lower-confidence categories using a variety of metrics in isolation and combination.
Methods
We used the I-TASSER suite to predict structural models for ∼5000 proteins encoded in Giardia duodenalis and identify their closest empirically determined structural homologues in the Protein Data Bank. Models were assigned to high or lower-confidence categories depending on the presence of matching PFAM domains in query and reference peptides. Metrics output from the suite and derived metrics were assessed for their ability to predict the high confidence category individually, and in combination through development of a random forest classifier.
Results
We identified 1095 high confidence models including 212 hypothetical proteins. Amino acid identity between query and reference peptides was the greatest individual predictor of high confidence status; however, the random forest classifier out-performed any metric in isolation (AUC = 0.977), and identified a subset of 305 high confidence-like models, corresponding to false positive predictions. High confidence models exhibited higher transcriptional abundance, and the classifier generalized across species, indicating the broad utility of this approach for automatically stratifying predicted structures. Additional structure-based clustering was used to cross-check confidence predictions in an expanded family of Nek kinases. Several high confidence-like proteins yielded substantial new insight into mechanisms of redox balance in Giardia duodenalis, a system central to the efficacy of limited anti-giardial drugs.
Conclusion
Structural proteomics combined with machine learning can aid genome annotation for genetically divergent organisms including human pathogens, and stratify predicted structures to promote efficient allocation of limited resources for experimental investigation.

Reduced familial fertility in carriers of mutations in the BRCA1 and BRCA2 genes, European Journal of Human Genetics, 2019.

Authors

Akopyan H, Kitsera N, Siekierzynska A, Nguyen-Dumont T, Hammet F, Tsimiklis H, Park DJ, Pope BJ, Southey MC, Blonioarz D, Myszka A

Journal

European Journal of Human Genetics

Volume

27

Year

2019

Keywords

Science & Technology, Life Sciences & Biomedicine, Biochemistry & Molecular Biology, Genetics & Heredity

Frequency of BRCA1 and BRCA2 Germline Variants in Women With Ovarian Cancer in Malaysia, Journal of Global Oncology, 2018.

Authors

Lim J, Lau SY, Bashah NS Ahmad, Lai KN, Wen WX, Hasan SN, Park DJ, Pope BJ, Nguyen-Dumont T, Southey MC, Rahman N, Woo YL, Thong MK, Ch'ng GS, Teo SH, Yoon SY

Journal

Journal of Global Oncology

Volume

4

Issue

Supplement 2

Year

2018

DOI

10.1200/jgo.18.50600

URL

http://dx.doi.org/10.1200/jgo.18.50600

Abstract

Background: Germline BRCA1 or BRCA2 pathogenic variants in ovarian cancer patients may be informative in risk management and treatment, with the advent of poly(ADP-ribose) polymerase inhibitors. In the era of precision medicine, companion diagnostics for BRCA1 and BRCA2 genes have been featured as a strategy in the Malaysia National Strategic Plan for Cancer Control Program (2016-2020). To facilitate this strategy, frequency data from Malaysia's understudied multiethnic population will be required. Aim: To determine the prevalence of BRCA1 and BRCA2 germline variants in a population-based cohort of ovarian cancer patients in Malaysia. Methods: Since August 2016, women with nonmucinous epithelial ovarian, peritoneal or fallopian tube carcinoma have been prospectively recruited to the Malaysia-wide population-based MaGiC Observational Study. DNA was tested using a Hi-Plex next generation sequencing method and multiplex ligation-dependent probe amplification to detect < 10 bp alterations and exon deletions or duplications in the BRCA1 and BRCA2 genes. Results: Interim results from 325 patients tested until March 2018 have identified BRCA1 and BRCA2 pathogenic variants in 9.8% (32/325) and 3.1% (10/325) of patients, respectively. Variants of uncertain significance were detected in 13.2% (43/325) of patients and no pathogenic variants were detected in 73.8% (240/325) of patients. Taken together, the frequency of BRCA1/2 pathogenic variants in ovarian cancer patients is approximately 12.9% (42/325). Conclusion: The identification of BRCA1 or BRCA2 carriers across the country has enabled the concentration of efforts from limited genetic counseling resources on high-risk families. Results arising from the completion of this study will supplement cancer control programs and genetic services in Malaysia.

sEst: Accurate Sex-Estimation and Abnormality Detection in Methylation Microarray Data, International Journal of Molecular Sciences, 2018.

Authors

Chol-Hee Jung, Daniel J. Park, Peter Georgeson, Khalid Mahmood, Roger L. Milne, Melissa C. Southey, Bernard J. Pope

Journal

International Journal of Molecular Sciences

Volume

19

Issue

10

Year

2018

DOI

10.3390/ijms19103172

Pubmed ID

30326623

URL

http://dx.doi.org/10.3390/ijms19103172

Keywords

DNA methylation, sex information, sex-chromosome abnormalities, epigenetics

Abstract

DNA methylation influences predisposition, development and prognosis for many diseases, including cancer. However, it is not uncommon to encounter samples with incorrect sex labelling or atypical sex chromosome arrangement. Sex is one of the strongest influencers of the genomic distribution of DNA methylation and, therefore, correct assignment of sex and filtering of abnormal samples are essential for the quality control of study data. Differences in sex chromosome copy numbers between sexes and X-chromosome inactivation in females result in distinctive sex-specific patterns in the distribution of DNA methylation levels. In this study, we present a software tool, sEst, which incorporates clustering analysis to infer sex and to detect sex-chromosome abnormalities from DNA methylation microarray data. Testing with two publicly available datasets demonstrated that sEst not only correctly inferred the sex of the test samples, but also identified mislabelled samples and samples with potential sex-chromosome abnormalities, such as Klinefelter syndrome and Turner syndrome, the latter being a feature not offered by existing methods. Considering that sex and the sex-chromosome abnormalities can have large effects on many phenotypes, including diseases, our method can make a significant contribution to DNA methylation studies that are based on microarray platforms.

Is RNASEL:p.Glu265* a modifier of early-onset breast cancer risk for carriers of high-risk mutations?, BMC Cancer, 2018.

Authors

Tú Nguyen-Dumont, Zhi L. Teo, Fleur Hammet, Alexis Roberge, Maryam Mahmoodi, Helen Tsimiklis, Daniel J. Park, Bernard J. Pope, Andrew Lonie, Miroslav K. Kapuscinski, Khalid Mahmood, ABCFR, David E. Goldgar, Graham G. Giles, Ingrid Winship, John L. Hopper, Melissa C. Southey

Journal

BMC Cancer

Volume

18

Issue

165

Year

2018

DOI

10.1186/s12885-018-4028-z

Pubmed ID

29422015

URL

http://dx.doi.org/10.1186/s12885-018-4028-z

Keywords

RNASEL:p.Glu265*, Breast cancer, Modifier risk gene, Early-onset cancer

Abstract

Background
Breast cancer risk for BRCA1 and BRCA2 pathogenic mutation carriers is modified by risk factors that cluster in families, including genetic modifiers of risk. We considered genetic modifiers of risk for carriers of high-risk mutations in other breast cancer susceptibility genes.
Methods
In a family known to carry the high-risk mutation PALB2:c.3113G>A (p.Trp1038*), whole-exome sequencing was performed on germline DNA from four affected women, three of whom were mutation carriers.
Results
RNASEL:p.Glu265* was identified in one of the PALB2 carriers who had two primary invasive breast cancer diagnoses before 50 years. Gene-panel testing of BRCA1, BRCA2, PALB2 and RNASEL in the Australian Breast Cancer Family Registry identified five carriers of RNASEL:p.Glu265* in 591 early onset breast cancer cases. Three of the five women (60%) carrying RNASEL:p.Glu265* also carried a pathogenic mutation in a breast cancer susceptibility gene compared with 30 carriers of pathogenic mutations in the 586 non-carriers of RNASEL:p.Glu265* (5%) (p < 0.002). Taqman genotyping demonstrated that the allele frequency of RNASEL:p.Glu265* was similar in affected and unaffected Australian women, consistent with other populations.
Conclusion
Our study suggests that RNASEL:p.Glu265* may be a genetic modifier of risk for early-onset breast cancer predisposition in carriers of high-risk mutations. Much larger case-case and case-control studies are warranted to test the association observed in this report.

FANCM and RECQL genetic variants and breast cancer susceptibility: relevance to South Poland and West Ukraine, BMC Medical Genetics, 2018.

Authors

T Nguyen-Dumont, A Myszka, P Karpinski, M Sasiadek, H Akopyan, F Hammet, H Tsimiklis, D Park, B Pope, R Slezak, N Kitsera, A Siekierzynska, M Southey

Journal

BMC Medical Genetics

Volume

19

Issue

12

Year

2018

DOI

10.1186/s12881-018-0524-x

Pubmed ID

29351780

URL

http://dx.doi.org/10.1186/s12881-018-0524-x

Keywords

FANCM, RECQL, Breast cancer predisposition, Familial breast cancer, Gene panel testing

Abstract

Background. FANCM and RECQL have recently been reported as breast cancer susceptibility genes and it has been suggested that they should be included on gene panel tests for breast cancer predisposition. However, the clinical value of testing for mutations in RECQL and FANCM remains to be determined. In this study, we have characterised the spectrum of FANCM and RECQL mutations in women affected with breast or ovarian cancer from South-West Poland and West Ukraine. Methods. We applied Hi-Plex, an amplicon-based enrichment method for targeted massively parallel sequencing, to screen the coding exons and proximal intron-exon junctions of FANCM and RECQL in germline DNA from unrelated women affected with breast cancer (n = 338) and ovarian cancer (n = 89) from Poland (n = 304) and Ukraine (n = 123). These women were at high risk of carrying a genetic predisposition to breast and/or ovarian cancer due to a family history and/or early-onset disease. Results. Among 427 women screened, we identified one carrier of the FANCM:c.1972C>T nonsense mutation (0.23%), and two carriers of the frameshift insertion FANCM:c.1491dup (0.47%). None of the variants we observed in RECQL were predicted to be loss-of-function mutations by standard variant effect prediction tools. Conclusions. Our study of the Polish and Ukrainian populations has identified a carrier frequency of truncating mutations in FANCM consistent with previous reports. Although initial reports suggesting that mutations in RECQL could be associated with increased breast cancer risk included women from Poland and identified the RECQL:c.1667_1667+3delAGTA mutation in 0.23–0.35% of breast cancer cases, we did not observe any carriers in our study cohort. Continued screening, both in research and diagnostic settings, will enable the accumulation of data that is needed to establish the clinical utility of including RECQL and FANCM on gene panel tests.

Hi-Plex for Simple, Accurate, and Cost-Effective Amplicon-based Targeted DNA Sequencing, Methods in Molecular Biology, 2018.

Authors

Pope B, Hammet F, Nguyen-Dumont T, Park D

Journal

Methods in Molecular Biology

Volume

1712

Year

2018

DOI

10.1007/978-1-4939-7514-3_5

Pubmed ID

29224068

URL

http://dx.doi.org/10.1007/978-1-4939-7514-3_5

Abstract

Hi-Plex is a suite of methods to enable simple, accurate, and cost-effective highly multiplex PCR-based targeted sequencing (Nguyen-Dumont et al., Biotechniques 58:33-36, 2015). At its core is the principle of using gene-specific primers (GSPs) to "seed" (or target) the reaction and universal primers to "drive" the majority of the reaction. In this manner, effects on amplification efficiencies across the target amplicons can, to a large extent, be restricted to early seeding cycles. Product sizes are defined within a relatively narrow range to enable high-specificity size selection, replication uniformity across target sites (including in the context of fragmented input DNA such as that derived from fixed tumor specimens (Nguyen-Dumont et al., Biotechniques 55:69-74, 2013; Nguyen-Dumont et al., Anal Biochem 470:48-51, 2015)), and application of high-specificity genetic variant calling algorithms (Pope et al., Source Code Biol Med 9:3, 2014; Park et al., BMC Bioinformatics 17:165, 2016). Hi-Plex offers a streamlined workflow that is suitable for testing large numbers of specimens without the need for automation.

Double somatic mutations as a cause of tumor mismatch repair-deficiency in population-based colorectal and endometrial cancer with Lynch-like syndrome, Cancer Research, 2017.

Authors

Buchanan DD, Clendenning M, Jayasekara H, Joo JE, Wong EM, Southey MC, Walters RJ, Pope BJ, Win AK, Hopper JL, Jenkins MA, Milne RL, Giles GG, English DR, Macrae FA, Spurdle AB, Winship IM, Rosty C

Journal

Cancer Research

Volume

77

Year

2017

DOI

10.1158/1538-7445.AM2017-4266

URL

http://dx.doi.org/10.1158/1538-7445.AM2017-4266

Keywords

Science & Technology, Life Sciences & Biomedicine, Oncology

Risk of colorectal cancer for carriers of a germline mutation in POLE or POLD1, Genetics in Medicine, 2017.

Authors

Buchanan D, Stewart J, Clendenning M, Rosty C, Mahmood K, Pope B, Jenkins M, Hopper J, Southey M, Macrae F, Winship M, Win A

Journal

Genetics in Medicine

Year

2017

DOI

10.1038/gim.2017.185

Pubmed ID

29120461

URL

http://dx.doi.org/10.1038/gim.2017.185

Abstract

Background: Germline mutations in the exonuclease domains of the POLE and POLD1 genes are associated with an as yet unquantified increased risk of colorectal cancer (CRC). Methods: We identified families with POLE or POLD1 variants by searching PubMed for relevant studies prior to October 2016 and by genotyping 669 population-based CRC cases diagnosed <60 years of age from the Australasian Colorectal Cancer Family Registry. We estimated the age-specific cumulative risks (penetrance) using a modified segregation analysis. Results: We observed 67 CRCs (mean age at diagnosis = 50.2 (standard deviation [SD] = 13.8) years) among 364 first- and second-degree relatives from 41 POLE families and 6 CRCs (mean age at diagnosis = 39.7 (SD = 6.83) years) among 69 relatives from 9 POLD1 families. We estimated risks of CRC to age 70 years (95% confidence interval [CI]) for males and females, respectively, to be: 40% (26%–57%) and 32% (20%–47%) for POLE mutation carriers; and 63% (15%–99%) and 52% (11%–99%) for POLD1 mutation carriers. Conclusion: CRC risks for POLE mutation carriers are sufficiently high to warrant consideration of annual colonoscopy screening and management guidelines comparable to Lynch syndrome. Refinement of estimates of CRC risk for POLD1 carriers is needed; however, clinical management recommendations could follow those suggested for POLE carriers.

Targeted massively parallel sequencing characterises the mutation spectrum of PALB2 in breast and ovarian cancer cases from Poland and Ukraine, Familial Cancer, 2017.

Authors

Myszka A, Nguyen-Dumont T, Karpinski P, Sasiadek MM, Akopyan H, Hammet F, Tsimiklis H, Park DJ, Pope BJ, Slezak R, Kitsera N, Siekierzynska A, Southey MC

Journal

Familial Cancer

Year

2017

DOI

10.1007/s10689-017-0050-6

Pubmed ID

29052111

URL

http://dx.doi.org/10.1007/s10689-017-0050-6

Keywords

PALB2; Breast cancer; Ovarian cancer; Genetic susceptibility; Massively parallel sequencing

Abstract

Loss-of-function germline mutations in the PALB2 gene are associated with an increase in breast cancer risk. The purpose of this study was to characterise the spectrum of PALB2 mutations in women affected with breast or ovarian cancer from South-West Poland and West Ukraine. We applied Hi-Plex, an amplicon-based enrichment method for targeted massively parallel sequencing, to screen the coding exons and proximal intron–exon junctions of PALB2 in germline DNA from unrelated women affected with breast cancer (n = 338) and ovarian cancer (n = 89) from Poland (n = 304) and Ukraine (n = 123). These women were at high risk of carrying a genetic predisposition to breast and/or ovarian cancer due to a family history and/or early-onset disease. Targeted sequencing identified two frameshift deletions: PALB2:c.509_510del; p.R170Ifs in three women affected with breast cancer and PALB2:c.172_175del; p.Q60Rfs in one woman affected with ovarian cancer. A number of other previously described missense (some predicted to be damaging by PolyPhen-2 and CADD) and synonymous mutations were also identified in this population. This study is consistent with previous reports that PALB2:c.509_510del and PALB2:c.172_175del are recurrent mutations associated with breast cancer predisposition in Polish women with a family history of the disease. Our study contributes to the accumulating evidence indicating that PALB2 should be included in genetic testing for breast cancer susceptibility in these populations to enhance risk assessment and management of women at high risk of developing breast cancer. These data could also contribute to ongoing work that is assessing the possible association between ovarian cancer risk and PALB2 mutations, for which there is currently no evidence.

A novel Drosophila injury model reveals severed axons are cleared through a Draper/MMP-1 signaling cascade, eLife, 2017.

Authors

Purice MD, Ray A, Münzel EJ, Pope BJ, Park DJ, Speese SD, Logan MA

Journal

eLife

Volume

6

Year

2017

DOI

10.7554/eLife.23611

Pubmed ID

28825401

URL

http://dx.doi.org/10.7554/eLife.23611

Keywords

glial immunity; Wallerian degeneration; extracellular matrix; Draper; matrix metalloproteinase

Abstract

Neural injury triggers swift responses from glia, including glial migration and phagocytic clearance of damaged neurons. The transcriptional programs governing these complex innate glial immune responses are still unclear. Here, we describe a novel injury assay in adult Drosophila that elicits widespread glial responses in the ventral nerve cord (VNC). We profiled injury-induced changes in VNC gene expression by RNA sequencing (RNA-seq) and found that responsive genes fall into diverse signaling classes. One factor, matrix metalloproteinase-1 (MMP-1), is induced in Drosophila ensheathing glia responding to severed axons. Interestingly, glial induction of MMP-1 requires the highly conserved engulfment receptor Draper, as well as AP-1 and STAT92E. In MMP-1 depleted flies, glia do not properly infiltrate neuropil regions after axotomy and, as a consequence, fail to clear degenerating axonal debris. This work identifies Draper-dependent activation of MMP-1 as a novel cascade required for proper glial clearance of severed axons.

Somatic mutations of the coding microsatellites within the beta-2-microglobulin gene in mismatch repair-deficient colorectal cancers and adenomas, Familial Cancer, 2017.

Authors

Clendenning M, Huang A, Jayasekara H, Lorans M, Preston S, O'Callaghan N, Pope BJ, Macrae FA, Winship IM, Milne RL, Giles GG, English DR, Hopper JL, Win AK, Jenkins MA, Southey MC, Rosty C, Buchanan DD, investigators from the Melbourne Collaborative Cohort Study and the Australasian Colorectal Cancer Family Registry Cohort

Journal

Familial Cancer

Year

2017

DOI

10.1007/s10689-017-0013-y

Pubmed ID

28616688

URL

http://dx.doi.org/10.1007/s10689-017-0013-y

Keywords

B2M; Colorectal cancer; Lynch syndrome; MLH1 methylation; Microsatellite instability; Mismatch repair deficiency

Abstract

In colorectal cancers (CRCs) with tumour mismatch repair (MMR) deficiency, genes involved in the host immune response that contain microsatellites in their coding regions, including beta-2-microglobulin (B2M), can acquire mutations that may alter the immune response, tumour progression and prognosis. We screened the coding microsatellites within B2M for somatic mutations in MMR-deficient CRCs and adenomas to determine associations with tumour subtypes, clinicopathological features and survival. Incident MMR-deficient CRCs from the Australasian Colorectal Cancer Family Registry (ACCFR) and the Melbourne Collaborative Cohort Study participants (n = 144) and 63 adenomas from 41 MMR gene mutation carriers from the ACCFR were screened for somatic mutations within five coding microsatellites of B2M. Hazard ratios (HR) and 95% confidence intervals (CI) for overall survival by B2M mutation status were estimated using Cox regression, adjusting for age at CRC diagnosis, sex, AJCC stage and grade. B2M mutations occurred in 30 (20.8%) of the 144 MMR-deficient CRCs (29% of the MLH1-methylated, 17% of the Lynch syndrome and 9% of the suspected Lynch CRCs). No B2M mutations were identified in the 63 adenomas tested. B2M mutations differed by site, stage, grade and lymphocytic infiltration, although none of these differences reached statistical significance (p > 0.05). The HR for overall survival for B2M mutated CRC was 0.65 (95% CI 0.29-1.48) compared with B2M wild-type. We observed differences in B2M mutation status in MMR-deficient CRC by tumour subtypes, site, stage, grade, immune infiltrate and for overall survival that warrant further investigation in larger studies before B2M mutation status can be considered to have clinical utility.

Four simple recommendations to encourage best practices in research software, F1000 Research, 2017.

Authors

Rafael C. Jiménez, Mateusz Kuzak, Monther Alhamdoosh, Michelle Barker, Bérénice Batut, Mikael Borg, Salvador Capella-Gutierrez, Neil Chue Hong, Martin Cook, Manuel Corpas, Madison Flannery, Leyla Garcia, Josep Ll. Gelpí, Simon Gladman, Carole Goble, Montserrat González Ferreiro, Alejandra Gonzalez-Beltran, Philippa C. Griffin, Björn Grüning, Jonas Hagberg, Petr Holub, Rob Hooft, Jon Ison, Daniel S. Katz, Brane Leskošek, Federico López Gómez, Luis J. Oliveira, David Mellor, Rowland Mosbergen, Nicola Mulder, Yasset Perez-Riverol, Robert Pergl, Horst Pichler, Bernard Pope, Ferran Sanz, Maria V. Schneider, Victoria Stodden, Radosław Suchecki, Radka Svobodová Vařeková, Harry-Anton Talvik, Ilian Todorov, Andrew Treloar, Sonika Tyagi, Maarten van Gompel, Daniel Vaughan, Allegra Via, Xiaochuan Wang, Nathan S. Watson-Haigh, Steve Crouch

Journal

F1000 Research

Volume

13

Issue

6

Year

2017

DOI

10.12688/f1000research.11407.1

Pubmed ID

28751965

URL

http://dx.doi.org/10.12688/f1000research.11407.1

Keywords

FAIR; Open Science; Open Source; best practices; code; guidelines; quality; recommendations; software; sustainability

Abstract

Scientific research relies on computer software, yet software is not always developed following practices that ensure its quality and sustainability. This manuscript does not aim to propose new software development best practices, but rather to provide simple recommendations that encourage the adoption of existing best practices. Software development best practices promote better quality software, and better quality software improves the reproducibility and reusability of research. These recommendations are designed around Open Source values, and provide practical suggestions that contribute to making research software and its source code more discoverable, reusable and transparent. This manuscript is aimed at developers, but also at organisations, projects, journals and funders that can increase the quality and sustainability of research software by encouraging the adoption of these recommendations.

Best practice data life cycle approaches for the life sciences, F1000 Research, 2017.

Authors

Griffin PC, Khadake J, LeMay KS, Lewis SE, Orchard S, Pask A, Pope B, Roessner U, Russell K, Seemann T, Treloar A, Tyagi S, Christiansen JH, Dayalan S, Gladman S, Hangartner SB, Hayden HL, Ho WWH, Keeble-Gagnère G, Korhonen PK, Neish P, Prestes PR, Richardson MF, Watson-Haigh NS, Wyres KL, Young ND, Schneider MV

Journal

F1000 Research

Year

2017

DOI

10.12688/f1000research.12344.1

URL

http://dx.doi.org/10.12688/f1000research.12344.1

Abstract

Throughout history, the life sciences have been revolutionised by technological advances; in our era this is manifested by advances in instrumentation for data generation, and consequently researchers now routinely handle large amounts of heterogeneous data in digital formats. The simultaneous transitions towards biology as a data science and towards a ‘life cycle’ view of research data pose new challenges. Researchers face a bewildering landscape of data management requirements, recommendations and regulations, without necessarily being able to access data management training or possessing a clear understanding of practical approaches that can assist in data management in their particular research domain.
Here we provide an overview of best practice data life cycle approaches for researchers in the life sciences/bioinformatics space with a particular focus on ‘omics’ datasets and computer-based data processing and analysis. We discuss the different stages of the data life cycle and provide practical suggestions for useful tools and resources to improve data management practices.

Variant effect prediction tools assessed using independent, functional assay-based datasets: implications for discovery and diagnostics, Human Genomics, 2017.

Authors

Mahmood K, Jung CH, Philip G, Georgeson P, Chung J, Pope BJ, Park DJ

Journal

Human Genomics

Volume

11

Issue

1

Year

2017

DOI

10.1186/s40246-017-0104-8

Pubmed ID

28511696

URL

http://dx.doi.org/10.1186/s40246-017-0104-8

Keywords

Benchmarking; Functional assays; Functional datasets; Genomic screening; Mutation assessment; Pathogenicity prediction; Protein function; Variant effect prediction

Abstract

BACKGROUND: Genetic variant effect prediction algorithms are used extensively in clinical genomics and research to determine the likely consequences of amino acid substitutions on protein function. It is vital that we better understand their accuracies and limitations because published performance metrics are confounded by serious problems of circularity and error propagation. Here, we derive three independent, functionally determined human mutation datasets, UniFun, BRCA1-DMS and TP53-TA, and employ them, alongside previously described datasets, to assess the pre-eminent variant effect prediction tools. RESULTS: Apparent accuracies of variant effect prediction tools were influenced significantly by the benchmarking dataset. Benchmarking with the assay-determined datasets UniFun and BRCA1-DMS yielded areas under the receiver operating characteristic curves in the modest ranges of 0.52 to 0.63 and 0.54 to 0.75, respectively, considerably lower than observed for other, potentially more conflicted datasets. CONCLUSIONS: These results raise concerns about how such algorithms should be employed, particularly in a clinical setting. Contemporary variant effect prediction tools are unlikely to be as accurate at the general prediction of functional impacts on proteins as previously reported. Use of functional assay-based datasets that avoid prior dependencies promises to be valuable for the ongoing development and accurate benchmarking of such tools.

Mutation screening of ACKR3 and COPS8 in kidney cancer cases from the CONFIRM study, Familial Cancer, 2017.

Authors

Mahmoodi M, Nguyen-Dumont T, Hammet F, Pope BJ, Park DJ, Southey MC, Darlow JM, Bruinsma F, Winship I

Journal

Familial Cancer

Year

2017

DOI

10.1007/s10689-016-9961-x

Pubmed ID

28063109

URL

http://dx.doi.org/10.1007/s10689-016-9961-x

Keywords

ACKR3; COPS8; Hi-Plex; Kidney cancer; Massively parallel sequencing; Mutation screening

Abstract

An apparently balanced t(2;3)(q37.3;q13.2) translocation that appears to segregate with renal cell carcinoma (RCC) has indicated potential areas to search for the elusive genetic basis of clear cell RCC. We applied Hi-Plex targeted sequencing to analyse germline DNA from 479 individuals affected with clear cell RCC for this breakpoint translocation and genetic variants in neighbouring genes on chromosome 2, ACKR3 and COPS8. While only synonymous variants were found in COPS8, one of the missense variants in ACKR3:c.892C>T, observed in 4/479 individuals screened (0.8%), was predicted likely to damage ACKR3 function. Identification of causal genes for RCC has potential clinical utility, where risk assessment and risk management can offer better outcomes, with surveillance for at-risk relatives and nephron sparing surgery through earlier intervention.

Single nucleotide-level mapping of DNA double-strand breaks in human HEK293T cells, Genomics Data, 2017.

Authors

Pope BJ, Mahmood K, Jung CH, Georgeson P, Park DJ

Journal

Genomics Data

Volume

11

Year

2017

DOI

10.1016/j.gdata.2016.11.007

Pubmed ID

27942458

URL

http://dx.doi.org/10.1016/j.gdata.2016.11.007

Keywords

Double-strand breaks; Fragile sites; Human genome; Forum domains; HEK293T

Abstract

Constitutional biological processes involve the generation of DNA double-strand breaks (DSBs). The production of such breaks and their subsequent resolution are also highly relevant to neurodegenerative diseases and cancer, in which extensive DNA fragmentation has been described (Stephens et al., 2011; Blondet et al., 2001). Tchurikov et al. (2011, 2013) have reported previously that frequent sites of DSBs occur in chromosomal domains involved in the co-ordinated expression of genes. This group report that hot spots of DSBs in human HEK293T cells often coincide with H3K4me3 marks, associated with active transcription (Kravatsky et al., 2015), and that frequent sites of DNA double-strand breakage are likely to be relevant to cancer genomics (Tchurikov et al., 2013, 2016). Recently, they applied a RAFT (rapid amplification of forum termini) protocol that selects for blunt-ended DSB sites and mapped these to the human genome within defined co-ordinate 'windows'. In this paper, we re-analyse public RAFT data to derive sites of DSBs at the single-nucleotide level across the built genome for human HEK293T cells (https://figshare.com/s/35220b2b79eaaaf64ed8). This refined mapping, combined with accessory ENCODE data tracks and ribosomal DNA-related sequence annotations, will likely be of value for the design of clinically relevant targeted assays such as those for cancer susceptibility, diagnosis, treatment-matching and prognostication.

MethPat: a tool for the analysis and visualisation of complex methylation patterns obtained by massively parallel sequencing, BMC Bioinformatics, 2016.

Authors

Wong NC, Pope BJ, Candiloro IL, Korbie D, Trau M, Wong SQ, Mikeska T, Zhang X, Pitman M, Eggers S, Doyle SR, Dobrovic A

Journal

BMC Bioinformatics

Volume

17

Year

2016

DOI

10.1186/s12859-016-0950-8

Pubmed ID

26911705

URL

http://dx.doi.org/10.1186/s12859-016-0950-8

Abstract

BACKGROUND: DNA methylation at a gene promoter region has the potential to regulate gene transcription. Patterns of methylation over multiple CpG sites in a region are often complex and cell type specific, with the region showing multiple allelic patterns in a sample. This complexity is commonly obscured when DNA methylation data is summarised as an average percentage value for each CpG site (or aggregated across CpG sites). True representation of methylation patterns can only be fully characterised by clonal analysis. Deep sequencing provides the ability to investigate clonal DNA methylation patterns in unprecedented detail and scale, enabling the proper characterisation of the heterogeneity of methylation patterns. However, the sheer amount and complexity of sequencing data requires new synoptic approaches to visualise the distribution of allelic patterns. RESULTS: We have developed a new analysis and visualisation software tool Methpat.

Fine resolution mapping of double-strand break sites for human ribosomal DNA units, Genomics Data, 2016.

Authors

Pope BJ, Mahmood K, Jung CH, Park DJ

Journal

Genomics Data

Volume

10

Year

2016

DOI

10.1016/j.gdata.2016.08.012

Pubmed ID

27656414

URL

http://dx.doi.org/10.1016/j.gdata.2016.08.012

Keywords

Double-strand breaks; Forum domains; Fragile sites; HEK293T; rDNA

Abstract

DNA breakage arises during a variety of biological processes, including transcription, replication and genome rearrangements. In the context of disease, extensive fragmentation of DNA has been described in cancer cells and during early stages of neurodegeneration (Stephens et al., 2011; Blondet et al., 2001). Stults et al. (2009) reported that human rDNA gene clusters are hotspots for recombination and that rDNA restructuring is among the most common chromosomal alterations in adult solid tumours. As such, analysis of rDNA regions is likely to have significant prognostic and predictive value, clinically. Tchurikov et al. (2015a, 2016) have made major advances in this direction, reporting that sites of human genome double-strand breaks (DSBs) occur frequently at sites in rDNA that are tightly linked with active transcription; the authors used a RAFT (rapid amplification of forum termini) protocol that selects for blunt-ended sites. They reported the relative frequency of these rDNA DSBs within defined co-ordinate 'windows' of varying size and made these data (as well as the relevant 'raw' sequencing information) available to the public (Tchurikov et al., 2015b). Assay designs targeting rDNA DSB hotspots will benefit greatly from the publication of break sites at greater resolution. Here, we re-analyse public RAFT data and make available rDNA DSB co-ordinates to the single-nucleotide level.

UNDR ROVER - a fast and accurate variant caller for targeted DNA sequencing, BMC Bioinformatics, 2016.

Authors

Park DJ, Li R, Lau E, Georgeson P, Nguyen-Dumont T, Pope BJ

Journal

BMC Bioinformatics

Volume

17

Issue

1

Year

2016

DOI

10.1186/s12859-016-1014-9

URL

http://dx.doi.org/10.1186/s12859-016-1014-9

Abstract

Background: Previously, we described ROVER, a DNA variant caller which identifies genetic variants from PCR-targeted massively parallel sequencing (MPS) datasets generated by the Hi-Plex protocol. ROVER permits stringent filtering of sequencing chemistry-induced errors by requiring reported variants to appear in both reads of overlapping pairs above certain thresholds of occurrence. ROVER was developed in tandem with Hi-Plex and has been used successfully to screen for genetic mutations in the breast cancer predisposition gene PALB2.
ROVER is applied to MPS data in BAM format and, therefore, relies on sequence reads being mapped to a reference genome. In this paper, we describe an improvement to ROVER, called UNDR ROVER (Unmapped primer-Directed ROVER), which accepts MPS data in FASTQ format, avoiding the need for a computationally expensive mapping stage. It does so by taking advantage of the location-specific nature of PCR-targeted MPS data.
Results: The UNDR ROVER algorithm achieves the same stringent variant calling as its predecessor with a significant runtime performance improvement. In one indicative sequencing experiment, UNDR ROVER (in its fastest mode) required 8-fold less sequential computation time than the ROVER pipeline and 13-fold less sequential computation time than a variant calling pipeline based on the popular GATK tool.
UNDR ROVER is implemented in Python and runs on all popular POSIX-like operating systems (Linux, OS X). It requires as input a tab-delimited format file containing primer sequence information, a FASTA format file containing the reference genome sequence, and paired FASTQ files containing sequence reads. Primer sequences at the 5′ end of reads associate read-pairs with their targeted amplicon and, thus, their expected corresponding coordinates in the reference genome. The primer-intervening sequence of each read is compared against the reference sequence from the same location and variants are identified using the same algorithm as ROVER. Specifically, for a variant to be ‘called’ it must appear at the same location in both of the overlapping reads above user-defined thresholds of minimum number of reads and proportion of reads.
Conclusions: UNDR ROVER provides the same rapid and accurate genetic variant calling as its predecessor with greatly reduced computational costs.
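
The primer-directed step described in this abstract (using the primer at the 5′ end of a read to find its amplicon and expected reference coordinates, with no mapping stage) can be illustrated roughly as follows. This is a minimal sketch and not the UNDR ROVER implementation: the primer sequences, amplicon names, coordinates and the function name `assign_amplicon` are all hypothetical, and exact prefix matching is assumed for simplicity.

```python
from typing import Dict, Optional, Tuple

# Hypothetical primer table: primer sequence -> (amplicon name, chromosome, start coordinate).
# In practice this information would come from the tab-delimited primer file.
PRIMERS: Dict[str, Tuple[str, str, int]] = {
    "ACGTACGTAC": ("amplicon_1", "chr16", 1000),
    "TTGCATTGCA": ("amplicon_2", "chr16", 2500),
}


def assign_amplicon(
    read: str, primers: Dict[str, Tuple[str, str, int]]
) -> Optional[Tuple[str, str, int, str]]:
    """Match the 5' end of a read against known primers.

    Returns (amplicon, chromosome, start, primer-trimmed sequence),
    or None when no primer matches the start of the read.
    """
    for primer, (name, chrom, start) in primers.items():
        if read.startswith(primer):
            # The remaining sequence is the primer-intervening part that
            # would be compared against the reference at this location.
            return name, chrom, start, read[len(primer):]
    return None
```

A real tool would presumably tolerate sequencing errors in the primer region; that tolerance is left out of this sketch.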

High-resolution twin-ion metabolite extraction (HiTIME) mass spectrometry: nontargeted detection of unknown drug metabolites by isotope labeling, liquid chromatography mass spectrometry, and automated high-performance computing, Analytical Chemistry, 2015.

Authors

Leeming MG, Isaac AP, Pope BJ, Cranswick N, Wright CE, Ziogas J, O'Hair RA, Donald WA

Journal

Analytical Chemistry

Volume

87

Issue

8

Year

2015

DOI

10.1021/ac504767d

Pubmed ID

25818563

URL

http://dx.doi.org/10.1021/ac504767d

Abstract

The metabolic fate of a compound can often determine the success of a new drug lead. Thus, significant effort is directed toward identifying the metabolites formed from a given molecule. Here, an automated and nontargeted procedure is introduced for detecting drug metabolites without authentic metabolite standards via the use of stable isotope labeling, liquid chromatography mass spectrometry (LC/MS), and high-performance computing. LC/MS of blood plasma extracts from rats that were administered a 1:1 mixture of acetaminophen (APAP) and 13C6-APAP resulted in mass spectra that contained “twin” ions for drug metabolites that were not detected in control spectra (i.e., no APAP administered). Because of the development of a program (high-resolution twin-ion metabolite extraction; HiTIME) that can identify twin-ions in high-resolution mass spectra without centroiding (i.e., reduction of mass spectral peaks to single data points), 9 doublets corresponding to APAP metabolites were identified. This is nearly twice that obtained by use of existing programs that make use of centroiding to reduce computational cost under these conditions with a quadrupole time-of-flight mass spectrometer. By a manual search for all reported APAP metabolite ions, no additional twin-ion signals were assigned. These data indicate that all the major metabolites of APAP and multiple low-abundance metabolites (e.g., acetaminophen hydroxy- and methoxysulfate) that are rarely reported were detected. This methodology can be used to detect drug metabolites without prior knowledge of their identity. HiTIME is freely available from https://github.com/bjpop/HiTIME
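
The "twin" ion idea in this abstract can be illustrated with a simplified sketch: a 1:1 mixture of unlabeled and 13C6-labeled drug should give pairs of signals separated by six times the 13C/12C mass difference, at similar intensity. The code below pairs peaks in an already reduced peak list, which is exactly the centroiding shortcut that HiTIME is described as avoiding, so it is not the HiTIME algorithm. The function name, tolerances and the assumption of singly charged ions are illustrative choices only.

```python
from typing import List, Tuple

C13_C12_DELTA = 1.0033548          # mass difference between 13C and 12C, in Da
TWIN_SHIFT = 6 * C13_C12_DELTA     # expected m/z spacing for a 13C6 label (charge 1 assumed)


def find_twin_ions(
    peaks: List[Tuple[float, float]],
    mz_tol: float = 0.01,
    min_ratio: float = 0.5,
    max_ratio: float = 2.0,
) -> List[Tuple[float, float]]:
    """Return (light m/z, heavy m/z) pairs separated by TWIN_SHIFT.

    peaks is a list of (m/z, intensity); a pair is kept only when the
    heavy/light intensity ratio is close to the 1:1 dosing ratio.
    """
    peaks = sorted(peaks)
    twins = []
    for i, (mz_light, int_light) in enumerate(peaks):
        for mz_heavy, int_heavy in peaks[i + 1:]:
            spacing = mz_heavy - mz_light
            if spacing > TWIN_SHIFT + mz_tol:
                break  # peaks are sorted, so later ones are even further away
            if abs(spacing - TWIN_SHIFT) <= mz_tol and int_light > 0:
                if min_ratio <= int_heavy / int_light <= max_ratio:
                    twins.append((mz_light, mz_heavy))
    return twins
```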

Exemplary multiplex bisulfite amplicon data used to demonstrate the utility of Methpat, GigaScience, 2015.

Authors

Wong NC, Pope BJ, Candiloro I, Korbie D, Trau M, Wong SQ, Mikeska T, van Denderen BJ, Thompson EW, Eggers S, Doyle SR, Dobrovic A

Journal

GigaScience

Volume

4

Year

2015

DOI

10.1186/s13742-015-0098-x

Pubmed ID

26613017

URL

http://dx.doi.org/10.1186/s13742-015-0098-x

Keywords

Bisulfite sequencing; Cancer; DNA methylation; Epialleles; Epigenetics; PCR; Visualization

Abstract

BACKGROUND: DNA methylation is a complex epigenetic marker that can be analyzed using a wide variety of methods. Interpretation and visualization of DNA methylation data can mask complexity in terms of methylation status at each CpG site, cellular heterogeneity of samples and allelic DNA methylation patterns within a given DNA strand. Bisulfite sequencing is considered the gold standard, but visualization of massively parallel sequencing results remains a significant challenge. FINDINGS: We created a program called Methpat that facilitates visualization and interpretation of bisulfite sequencing data generated by massively parallel sequencing. To demonstrate this, we performed multiplex PCR that targeted 48 regions of interest across 86 human samples. The regions selected included known gene promoters associated with cancer, repetitive elements, known imprinted regions and mitochondrial genomic sequences. We interrogated a range of samples including human cell lines, primary tumours and primary tissue samples. Methpat generates two forms of output: a tab-delimited text file for each sample that summarizes DNA methylation patterns and their read counts for each amplicon, and a HTML file that summarizes this data visually. Methpat can be used with publicly available whole genome bisulfite sequencing and reduced representation bisulfite sequencing datasets with sufficient read depths. CONCLUSIONS: Using Methpat, complex DNA methylation data derived from massively parallel sequencing can be summarized and visualized for biological interpretation. By accounting for allelic DNA methylation states and their abundance in a sample, Methpat can unmask the complexity of DNA methylation and yield further biological insight in existing datasets.
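
The per-amplicon summary described in this abstract (methylation patterns and their read counts) can be sketched as a simple tally. This is an illustration rather than Methpat's own code: the input format, with each read reduced to an amplicon name and a string of `1` (methylated) and `0` (unmethylated) per CpG site, and the function name `count_patterns` are assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def count_patterns(reads: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Tally methylation patterns per amplicon.

    Each read is (amplicon, pattern), where pattern has one character
    per CpG site: '1' for methylated, '0' for unmethylated.
    """
    counts: Dict[str, Counter] = {}
    for amplicon, pattern in reads:
        counts.setdefault(amplicon, Counter())[pattern] += 1
    return counts
```

Keeping whole patterns, rather than averaging each CpG site, is what preserves the allelic information the abstract refers to: two reads `110` and one read `000` average to about 67% methylation at the first site, which hides the fact that only two distinct epialleles are present.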

Abridged adapter primers increase the target scope of Hi-Plex, BioTechniques, 2015.

Authors

Nguyen-Dumont T, Hammet F, Mahmoodi M, Pope BJ, Giles GG, Hopper JL, Southey MC, Park DJ

Journal

BioTechniques

Volume

58

Issue

1

Year

2015

DOI

10.2144/000114247

Pubmed ID

25605578

URL

http://dx.doi.org/10.2144/000114247

Keywords

Hi-Plex; genetic screening; massively parallel sequencing; mutation screening; next-generation sequencing; targeted sequencing

Abstract

Previously, we reported Hi-Plex, an amplicon-based method for targeted massively parallel sequencing capable of generating 60 amplicons simultaneously. In further experiments, however, we found our approach did not scale to higher amplicon numbers. Here, we report a modification to the original Hi-Plex protocol that includes the use of abridged adapter oligonucleotides as universal primers (bridge primers) in the initial PCR mixture. Full-length adapter primers (indexing primers) are included only during latter stages of thermal cycling with concomitant application of elevated annealing temperatures. Using this approach, we demonstrate the application of Hi-Plex across a broad range of amplicon numbers (16-plex, 62-plex, 250-plex, and 1003-plex) while preserving the low amount (25 ng) of input DNA required.

Mutation screening of PALB2 in clinically ascertained families from the Breast Cancer Family Registry, Breast Cancer Research and Treatment, 2015.

Authors

Nguyen-Dumont T, Hammet F, Mahmoodi M, Tsimiklis H, Teo ZL, Li R, Pope BJ, Terry MB, Buys SS, Daly M, Hopper JL, Winship I, Goldgar DE, Park DJ, Southey MC

Journal

Breast Cancer Research and Treatment

Volume

149

Issue

2

Year

2015

DOI

10.1007/s10549-014-3260-8

Pubmed ID

25575445

URL

http://dx.doi.org/10.1007/s10549-014-3260-8

Abstract

Loss-of-function mutations in PALB2 are associated with an increased risk of breast cancer, with recent data showing that female breast cancer risks for PALB2 mutation carriers are comparable in magnitude to those for BRCA2 mutation carriers. This study applied targeted massively parallel sequencing to characterize the mutation spectrum of PALB2 in probands attending breast cancer genetics clinics in the USA. The coding regions and proximal intron-exon junctions of PALB2 were screened in probands not known to carry a mutation in BRCA1 or BRCA2 from 1,250 families enrolled through familial cancer clinics by the Breast Cancer Family Registry. Mutation screening was performed using Hi-Plex, an amplicon-based targeted massively parallel sequencing platform. Screening of PALB2 was successful in 1,240/1,250 probands and identified nine women with protein-truncating mutations (three nonsense mutations and five frameshift mutations). Four of the 33 missense variants were predicted to be deleterious to protein function by in silico analysis using two different programs. Analysis of tumors from carriers of truncating mutations revealed that the majority were high histological grade, invasive ductal carcinomas. Young onset was apparent in most families, with 19 breast cancers under 50 years of age, including eight under the age of 40 years. Our data demonstrate the utility of Hi-Plex in the context of high-throughput testing for rare genetic mutations and provide additional timely information about the nature and prevalence of PALB2 mutations, to enhance risk assessment and risk management of women at high risk of cancer attending clinical genetic services.

SRST2: Rapid genomic surveillance for public health and hospital microbiology labs, Genome Medicine, 2014.

Authors

Inouye M, Dashnow H, Raven LA, Schultz MB, Pope BJ, Tomita T, Zobel J, Holt KE

Journal

Genome Medicine

Volume

6

Issue

11

Year

2014

DOI

10.1186/s13073-014-0090-6

Pubmed ID

25422674

URL

http://dx.doi.org/10.1186/s13073-014-0090-6

Abstract

Rapid molecular typing of bacterial pathogens is critical for public health epidemiology, surveillance and infection control, yet routine use of whole genome sequencing (WGS) for these purposes poses significant challenges. Here we present SRST2, a read mapping-based tool for fast and accurate detection of genes, alleles and multi-locus sequence types (MLST) from WGS data. Using >900 genomes from common pathogens, we show SRST2 is highly accurate and outperforms assembly-based methods in terms of both gene detection and allele assignment. We include validation of SRST2 within a public health laboratory, and demonstrate its use for microbial genome surveillance in the hospital setting. In the face of rising threats of antimicrobial resistance and emerging virulence among bacterial pathogens, SRST2 represents a powerful tool for rapidly extracting clinically useful information from raw WGS data. Source code is available from http://katholt.github.io/srst2/.

ROVER variant caller: read-pair overlap considerate variant-calling software applied to PCR-based massively parallel sequencing datasets, Source Code for Biology and Medicine, 2014.

Authors

Pope BJ, Nguyen-Dumont T, Hammet F, Park DJ

Journal

Source Code for Biology and Medicine

Volume

9

Issue

1

Year

2014

DOI

10.1186/1751-0473-9-3

Pubmed ID

24461215

URL

http://dx.doi.org/10.1186/1751-0473-9-3

Abstract

BACKGROUND: We recently described Hi-Plex, a highly multiplexed PCR-based target-enrichment system for massively parallel sequencing (MPS), which allows the uniform definition of library size so that subsequent paired-end sequencing can achieve complete overlap of read pairs. Variant calling from Hi-Plex-derived datasets can thus rely on the identification of variants appearing in both reads of read-pairs, permitting stringent filtering of sequencing chemistry-induced errors. These principles underlie ROVER software (derived from Read Overlap PCR-MPS variant caller), which we have recently used to report the screening for genetic mutations in the breast cancer predisposition gene PALB2. Here, we describe the algorithms underlying ROVER and its usage. RESULTS: ROVER enables users to quickly and accurately identify genetic variants from PCR-targeted, overlapping paired-end MPS datasets. The open-source availability of the software and threshold tailorability enables broad access for a range of PCR-MPS users. METHODS: ROVER is implemented in Python and runs on all popular POSIX-like operating systems (Linux, OS X). The software accepts a tab-delimited text file listing the coordinates of the target-specific primers used for targeted enrichment based on a specified genome-build. It also accepts aligned sequence files resulting from mapping to the same genome-build. ROVER identifies the amplicon a given read-pair represents and removes the primer sequences by using the mapping co-ordinates and primer co-ordinates. It considers overlapping read-pairs with respect to primer-intervening sequence. Only when a variant is observed in both reads of a read-pair does the signal contribute to a tally of read-pairs containing or not containing the variant. A user-defined threshold informs the minimum number of, and proportion of, read-pairs a variant must be observed in for a 'call' to be made. ROVER also reports the depth of coverage across amplicons to facilitate the identification of any regions that may require further screening. CONCLUSIONS: ROVER can facilitate rapid and accurate genetic variant calling for a broad range of PCR-MPS users.
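
The calling rule described in this abstract (a variant counts only when both reads of a pair show it, and is called only above a minimum number and proportion of read-pairs) might be sketched as below. This is a simplified illustration and not ROVER's source code: the variant representation, the function name `call_variants`, and the assumption that every read-pair of an amplicon covers every position are all hypothetical simplifications.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple

Variant = Tuple[int, str, str]  # (position, reference allele, alternate allele)


def call_variants(
    read_pairs: Iterable[Tuple[Set[Variant], Set[Variant]]],
    min_pairs: int,
    min_proportion: float,
) -> List[Variant]:
    """Call variants seen in both reads of enough read-pairs.

    read_pairs yields, for one amplicon, the variants observed in
    read 1 and in read 2 of each overlapping pair.
    """
    tally: Counter = Counter()
    total = 0
    for read1_variants, read2_variants in read_pairs:
        total += 1
        # Only variants present in BOTH reads of the pair contribute.
        for variant in read1_variants & read2_variants:
            tally[variant] += 1
    return sorted(
        variant
        for variant, n in tally.items()
        if n >= min_pairs and n / total >= min_proportion
    )
```

Requiring agreement between the two reads is what filters out errors that arise independently in a single read, since such errors rarely occur at the same position with the same base in both.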

Rare mutations in RINT1 predispose carriers to breast and Lynch syndrome-spectrum cancers, Cancer Discovery, 2014.

Authors

Park DJ, Tao K, Le Calvez-Kelm F, Nguyen-Dumont T, Robinot N, Hammet F, Odefrey F, Tsimiklis H, Teo ZL, Thingholm LB, Young EL, Voegele C, Lonie A, Pope BJ, Roane TC, Bell R, Hu H, Shankaracharya, Huff CD, Ellis J, Li J, Makunin IV, John EM, Andrulis IL, Terry MB, Daly M, Buys SS, Snyder C, Lynch HT, Devilee P, Giles GG, Hopper JL, Feng BJ, Lesueur F, Tavtigian SV, Southey MC, Goldgar DE

Journal

Cancer Discovery

Volume

4

Issue

7

Year

2014

DOI

10.1158/2159-8290.CD-14-0212

Pubmed ID

25050558

URL

http://dx.doi.org/10.1158/2159-8290.CD-14-0212

Abstract

Approximately half of the familial aggregation of breast cancer remains unexplained. A multiple-case breast cancer family exome-sequencing study identified three likely pathogenic mutations in RINT1 (NM_021930.4) not present in public sequencing databases: RINT1 c.343C>T (p.Q115X), c.1132_1134del (p.M378del), and c.1207G>T (p.D403Y). On the basis of this finding, a population-based case-control mutation-screening study was conducted that identified 29 carriers of rare (minor allele frequency < 0.5%), likely pathogenic variants: 23 in 1,313 early-onset breast cancer cases and six in 1,123 frequency-matched controls [OR, 3.24; 95% confidence interval (CI), 1.29-8.17; P = 0.013]. RINT1 mutation screening of probands from 798 multiple-case breast cancer families identified four additional carriers of rare genetic variants. Analysis of the incidence of first primary cancers in families of women carrying RINT1 mutations estimated that carriers were at increased risk of Lynch syndrome-spectrum cancers [standardized incidence ratio (SIR), 3.35; 95% CI, 1.7-6.0; P = 0.005], particularly for relatives diagnosed with cancer under the age of 60 years (SIR, 10.9; 95% CI, 4.7-21; P = 0.0003). SIGNIFICANCE: The work described in this study adds RINT1 to the growing list of genes in which rare sequence variants are associated with intermediate levels of breast cancer risk. Given that RINT1 is also associated with a spectrum of cancers with mismatch repair defects, these findings have clinical applications and raise interesting biological questions.

Annokey: an annotation tool based on key term search of the NCBI Entrez Gene database, Source Code for Biology and Medicine, 2014.

Authors

Park DJ, Nguyen-Dumont T, Kang S, Verspoor K, Pope BJ

Journal

Source Code for Biology and Medicine

Volume

9

Issue

1

Year

2014

DOI

10.1186/1751-0473-9-15

URL

http://dx.doi.org/10.1186/1751-0473-9-15

Abstract

Background: The NCBI Entrez Gene and PubMed databases contain a wealth of high-quality information about genes for many different organisms. The NCBI Entrez online web-search interface is convenient for simple manual search for a small number of genes but impractical for the kinds of outputs seen in typical genomics projects.
Results: We have developed an efficient open source tool implemented in Python called Annokey, which annotates gene lists with the results of a keyword search of the NCBI Entrez Gene database and linked PubMed article information. The user steers the search by specifying a ranked list of keywords (including multi-word phrases and regular expressions) that are correlated with their topic of interest. Rank information of matched terms allows the user to guide further investigation.
We applied Annokey to the entire human Entrez Gene database using the key-term “DNA repair” and assessed its performance in identifying the 176 members of a published “gold standard” list of genes established to be involved in this pathway. For this test case we observed a sensitivity and specificity of 97% and 96%, respectively.
Conclusions: Annokey facilitates the identification of genes related to an area of interest, a task which can be onerous if performed manually on a large number of genes. Annokey provides a way to capitalize on the high quality information provided by the Entrez Gene database allowing both scalability and compatibility with automated analysis pipelines, thus offering the potential to significantly enhance research productivity.
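
The ranked keyword search described in this abstract can be illustrated with a short sketch: keywords are tried in rank order against a gene's descriptive text, and the best-ranked match is reported. This is not Annokey's interface; the function name `best_ranked_match`, the case-insensitive matching and the example keywords are assumptions made for illustration.

```python
import re
from typing import List, Optional, Tuple


def best_ranked_match(text: str, ranked_keywords: List[str]) -> Optional[Tuple[int, str]]:
    """Return (rank, keyword) for the best-ranked keyword found in text.

    ranked_keywords is ordered from most to least relevant (rank 1 first);
    each entry is treated as a regular expression, so plain words,
    multi-word phrases and patterns are all allowed.
    """
    for rank, keyword in enumerate(ranked_keywords, start=1):
        if re.search(keyword, text, flags=re.IGNORECASE):
            return rank, keyword
    return None
```

For example, with the hypothetical list `["DNA repair", r"double[- ]strand break", "recombination"]`, a gene summary mentioning "double-strand break repair" but not "DNA repair" would be annotated with rank 2, which lets the user sort a long gene list by how strongly each entry matches the topic.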

MYRF is a membrane-associated transcription factor that autoproteolytically cleaves to directly activate myelin genes, PLoS Biology, 2013.

Authors

Bujalka H, Koenning M, Jackson S, Perreau VM, Pope B, Hay CM, Mitew S, Hill AF, Lu QR, Wegner M, Srinivasan R, Svaren J, Willingham M, Barres BA, Emery B

Journal

PLoS Biology

Volume

11

Issue

8

Year

2013

DOI

10.1371/journal.pbio.1001625

Pubmed ID

23966833

URL

http://dx.doi.org/10.1371/journal.pbio.1001625

Abstract

The myelination of axons is a crucial step during vertebrate central nervous system (CNS) development, allowing for rapid and energy efficient saltatory conduction of nerve impulses. Accordingly, the differentiation of oligodendrocytes, the myelinating cells of the CNS, and their expression of myelin genes are under tight transcriptional control. We previously identified a putative transcription factor, Myelin Regulatory Factor (Myrf), as being vital for CNS myelination. Myrf is required for the generation of CNS myelination during development and also for its maintenance in the adult. It has been controversial, however, whether Myrf directly regulates transcription, with reports of a transmembrane domain and lack of nuclear localization. Here we show that Myrf is a membrane-associated transcription factor that undergoes an activating proteolytic cleavage to separate its transmembrane domain-containing C-terminal region from a nuclear-targeted N-terminal region. Unexpectedly, this cleavage event occurs via a protein domain related to the autoproteolytic intramolecular chaperone domain of the bacteriophage tail spike proteins, the first time this domain has been found to play a role in eukaryotic proteins. Using ChIP-Seq we show that the N-terminal cleavage product directly binds the enhancer regions of oligodendrocyte-specific and myelin genes. This binding occurs via a defined DNA-binding consensus sequence and strongly promotes the expression of target genes. These findings identify Myrf as a novel example of a membrane-associated transcription factor and provide a direct molecular mechanism for its regulation of oligodendrocyte differentiation and CNS myelination.

FAVR (Filtering and Annotation of Variants that are Rare): methods to facilitate the analysis of rare germline genetic variants from massively parallel sequencing datasets, BMC Bioinformatics, 2013.

Authors

Pope BJ, Nguyen-Dumont T, Odefrey F, Hammet F, Bell R, Tao K, Tavtigian SV, Goldgar DE, Lonie A, Southey MC, Park DJ

Journal

BMC Bioinformatics

Volume

14

Year

2013

DOI

10.1186/1471-2105-14-65

Pubmed ID

23441864

URL

http://dx.doi.org/10.1186/1471-2105-14-65

Abstract

BACKGROUND: Characterising genetic diversity through the analysis of massively parallel sequencing (MPS) data offers enormous potential to significantly improve our understanding of the genetic basis for observed phenotypes, including predisposition to and progression of complex human disease. Great challenges remain in resolving genetic variants that are genuine from the millions of artefactual signals. RESULTS: FAVR is a suite of new methods designed to work with commonly used MPS analysis pipelines to assist in the resolution of some of the issues related to the analysis of the vast amount of resulting data, with a focus on relatively rare genetic variants. To the best of our knowledge, no equivalent method has previously been described. The most important and novel aspect of FAVR is the use of signatures in comparator sequence alignment files during variant filtering, and annotation of variants potentially shared between individuals. The FAVR methods use these signatures to facilitate filtering of (i) platform and/or mapping-specific artefacts, (ii) common genetic variants, and, where relevant, (iii) artefacts derived from imbalanced paired-end sequencing, as well as annotation of genetic variants based on evidence of co-occurrence in individuals. We applied conventional variant calling to whole-exome sequencing datasets, produced using both SOLiD and TruSeq chemistries, with or without downstream processing by FAVR methods. We demonstrate a 3-fold smaller rare single nucleotide variant shortlist with no detected reduction in sensitivity. This analysis included Sanger sequencing of rare variant signals not evident in dbSNP131, assessment of known variant signal preservation, and comparison of observed and expected rare variant numbers across a range of first cousin pairs. The principles described herein were applied in our recent publication identifying XRCC2 as a new breast cancer risk gene and have been made publicly available as a suite of software tools. CONCLUSIONS: FAVR is a platform-agnostic suite of methods that significantly enhances the analysis of large volumes of sequencing data for the study of rare genetic variants and their influence on phenotypes.

A high-plex PCR approach for massively parallel sequencing, BioTechniques, 2013.

Authors

Nguyen-Dumont T, Pope BJ, Hammet F, Southey MC, Park DJ

Journal

BioTechniques

Volume

55

Issue

2

Year

2013

DOI

10.2144/000114052

Pubmed ID

23931594

URL

http://dx.doi.org/10.2144/000114052

Abstract

Current methods for targeted massively parallel sequencing (MPS) have several drawbacks, including limited design flexibility, expense, and protocol complexity, which restrict their application to settings involving modest target size and requiring low cost and high throughput. To address this, we have developed Hi-Plex, a PCR-MPS strategy intended for high-throughput screening of multiple genomic target regions that integrates simple, automated primer design software to control product size. Featuring permissive thermocycling conditions and clamp bias reduction, our protocol is simple, cost- and time-effective, uses readily available reagents, does not require expensive instrumentation, and requires minimal optimization. In a 60-plex assay targeting the breast cancer predisposition genes PALB2 and XRCC2, we applied Hi-Plex to 100 ng LCL-derived DNA, and 100 ng and 25 ng FFPE tumor-derived DNA. Altogether, at least 86.94% of the human genome-mapped reads were on target, and 100% of targeted amplicons were represented within 25-fold of the mean. Using 25 ng FFPE-derived DNA, 95.14% of mapped reads were on-target and relative representation ranged from 10.1-fold lower to 5.8-fold higher than the mean. These results were obtained using only the initial automatically-designed primers present in equal concentration. Hi-Plex represents a powerful new approach for screening panels of genomic target regions.

Cross-platform compatibility of Hi-Plex, a streamlined approach for targeted massively parallel sequencing, Analytical Biochemistry, 2013.

Authors

Nguyen-Dumont T, Pope BJ, Hammet F, Mahmoodi M, Tsimiklis H, Southey MC, Park DJ

Journal

Analytical Biochemistry

Volume

442

Issue

2

Year

2013

DOI

10.1016/j.ab.2013.07.046

Pubmed ID

23933242

URL

http://dx.doi.org/10.1016/j.ab.2013.07.046

Keywords

Disease gene screening; High-Plex PCR; Massively parallel sequencing; Molecular diagnostics; Targeted sequencing

Abstract

Although per-base sequencing costs have decreased during recent years, library preparation for targeted massively parallel sequencing remains constrained by high reagent cost, limited design flexibility, and protocol complexity. To address these limitations, we previously developed Hi-Plex, a polymerase chain reaction (PCR) massively parallel sequencing strategy for screening panels of genomic target regions. Here, we demonstrate that Hi-Plex applied with hybrid adapters can generate a library suitable for sequencing with both the Ion Torrent and the TruSeq chemistries and that adjusting primer concentrations improves coverage uniformity. These results expand Hi-Plex capabilities as an accurate, affordable, flexible, and rapid approach for various genetic screening applications.

Hi-Plex for high-throughput mutation screening: application to the breast cancer susceptibility gene PALB2, BMC Medical Genomics, 2013.

Authors

Nguyen-Dumont T, Teo ZL, Pope BJ, Hammet F, Mahmoodi M, Tsimiklis H, Sabbaghian N, Tischkowitz M, Foulkes WD, Kathleen Cuningham Foundation Consortium for Research into Familial Breast Cancer (kConFab), Giles GG, Hopper JL, Australian Breast Cancer Family Registry, Southey MC, Park DJ

Journal

BMC Medical Genomics

Volume

6

Year

2013

DOI

10.1186/1755-8794-6-48

Pubmed ID

24206657

URL

http://dx.doi.org/10.1186/1755-8794-6-48

Abstract

BACKGROUND: Massively parallel sequencing (MPS) has revolutionised biomedical research and offers enormous capacity for clinical application. We previously reported Hi-Plex, a streamlined highly-multiplexed PCR-MPS approach, allowing a given library to be sequenced with both the Ion Torrent and TruSeq chemistries. Comparable sequencing efficiency was achieved using material derived from lymphoblastoid cell lines and formalin-fixed paraffin-embedded tumour. METHODS: Here, we report high-throughput application of Hi-Plex by performing blinded mutation screening of the coding regions of the breast cancer susceptibility gene PALB2 on a set of 95 blood-derived DNA samples that had previously been screened using Sanger sequencing and high-resolution melting curve analysis (n = 90), or genotyped by Taqman probe-based assays (n = 5). Hi-Plex libraries were prepared simultaneously using relatively inexpensive, readily available reagents in a simple half-day protocol followed by MPS on a single MiSeq run. RESULTS: We observed that 99.93% of amplicons were represented at ≥10X coverage. All 56 previously identified variant calls were detected and no false positive calls were assigned. Four additional variant calls were made and confirmed upon re-analysis of previous data or subsequent Sanger sequencing. CONCLUSIONS: These results support Hi-Plex as a powerful approach for rapid, cost-effective and accurate high-throughput mutation screening. They further demonstrate that Hi-Plex methods are suitable for and can meet the demands of high-throughput genetic testing in research and clinical settings.

Identification of new breast cancer predisposition genes via whole exome sequencing, Hereditary Cancer in Clinical Practice, 2012.

Authors

Southey MC, Park DJ, Lesueur F, Odefrey F, Nguyen-Dumont T, Hammet F, Neuhausen SL, John EM, Andrulis IL, Chenevix-Trench G, Baglietto L, Le Calvez-Kelm F, Pertesi M, Lonie A, Pope B, Sinilnikova O, Tsimiklis H, Giles GG, Hopper JL, Tavtigian SV, Goldgar DE

Journal

Hereditary cancer in clinical practice

Volume

10

Issue

2

Year

2012

DOI

10.1186/1897-4287-10-S2-A40

URL

http://dx.doi.org/10.1186/1897-4287-10-S2-A40

Expanded genetic analysis of a PALB2 c.3113G>A mutation carrying multiple-case breast cancer family via exome sequencing, Hereditary cancer in clinical practice, 2012.

Authors

Teo ZL, Park DJ, Odefrey F, Hammet F, Nguyen-Dumont T, Tsimiklis H, Pope BJ, Lonie A, Winship I, Giles GG, Others

Journal

Hereditary cancer in clinical practice

Volume

10

Issue

2

Year

2012

URL

https://hccpjournal.biomedcentral.com/articles/10.1186/1897-4287-10-S2-A92

Bpipe: a tool for running and managing bioinformatics pipelines, Bioinformatics, 2012.

Authors

Sadedin SP, Pope B, Oshlack A

Journal

Bioinformatics

Volume

28

Issue

11

Year

2012

DOI

10.1093/bioinformatics/bts167

Pubmed ID

22500002

URL

http://dx.doi.org/10.1093/bioinformatics/bts167

Abstract

SUMMARY: Bpipe is a simple, dedicated programming language for defining and executing bioinformatics pipelines. It specializes in enabling users to turn existing pipelines based on shell scripts or command line tools into highly flexible, adaptable and maintainable workflows with a minimum of effort. Bpipe ensures that pipelines execute in a controlled and repeatable fashion and keeps audit trails and logs to ensure that experimental results are reproducible. Requiring only Java as a dependency, Bpipe is fully self-contained and cross-platform, making it very easy to adopt and deploy into existing environments. AVAILABILITY AND IMPLEMENTATION: Bpipe is freely available from http://bpipe.org under a BSD License.

Rare mutations in XRCC2 increase the risk of breast cancer, American journal of human genetics, 2012.

Authors

Park DJ, Lesueur F, Nguyen-Dumont T, Pertesi M, Odefrey F, Hammet F, Neuhausen SL, John EM, Andrulis IL, Terry MB, Daly M, Buys S, Le Calvez-Kelm F, Lonie A, Pope BJ, Tsimiklis H, Voegele C, Hilbers FM, Hoogerbrugge N, Barroso A, Osorio A, Breast Cancer Family Registry, Kathleen Cuningham Foundation Consortium for Research into Familial Breast Cancer, Giles GG, Devilee P, Benitez J, Hopper JL, Tavtigian SV, Goldgar DE, Southey MC

Journal

American journal of human genetics

Volume

90

Issue

4

Year

2012

DOI

10.1016/j.ajhg.2012.02.027

Pubmed ID

22464251

URL

http://dx.doi.org/10.1016/j.ajhg.2012.02.027

Abstract

An exome-sequencing study of families with multiple breast-cancer-affected individuals identified two families with XRCC2 mutations, one with a protein-truncating mutation and one with a probably deleterious missense mutation. We performed a population-based case-control mutation-screening study that identified six probably pathogenic coding variants in 1,308 cases with early-onset breast cancer and no variants in 1,120 controls (the severity grading was p < 0.02). We also performed additional mutation screening in 689 multiple-case families. We identified ten breast-cancer-affected families with protein-truncating or probably deleterious rare missense variants in XRCC2. Our identification of XRCC2 as a breast cancer susceptibility gene thus increases the proportion of breast cancers that are associated with homologous recombination-DNA-repair dysfunction and Fanconi anemia and could therefore benefit from specific targeted treatments such as PARP (poly ADP ribose polymerase) inhibitors. This study demonstrates the power of massively parallel sequencing for discovering susceptibility genes for common, complex diseases.

A Computational Model for Retinal Ganglion Cell Axon Pathfinding, Investigative Ophthalmology & Visual Science, 2012.

Authors

Andrew Turpin, Bernard Pope, Jonathan Denniss

Journal

Investigative Ophthalmology & Visual Science

Volume

53

Issue

14

Year

2012

URL

https://iovs.arvojournals.org/article.aspx?articleid=2358170

Abstract

Purpose: To investigate whether an algorithm for simulating human retinal ganglion cell (RGC) axon pathfinding during development that does not rely on a chemical gradient produces plausible axon paths. Methods: A computational model of human RGC axon pathfinding was developed as follows. The algorithm assumes a lattice covering the retina that is populated with RGCs according to known average cell density for a healthy human retina (Curcio CA, Allen KA, J Comp Neurol, 1990: 300; 5-25). Each cell sends an axon growth cone out in "search mode" that locates a nearby axon. When an axon is located, the growth cone enters "bundle mode", and follows on top of the axon until a critical retinal thickness constraint is met. If there is no room to stack on top of the existing axon, the growth cone reverts to search mode, and finds either an unfilled blank location, or another axon to follow. Two restrictions on this basic algorithm prevent random axon pathways forming. Firstly, cells nearest the optic nerve head (ONH) develop first, so following an existing axon will take a new axon towards the ONH. Secondly, when in search mode, searching only happens within ±90 degrees of the current trajectory. The thickness constraint is mediated as a linear function beginning at zero at the fovea, and reaching a maximum of 60 axons at a distance of 3 mm from the fovea. There is no thickness constraint outside of a radius of 3 mm from the fovea. Output from the model was compared with a previous mathematical model of retinal nerve fibre bundles based on hand-drawn traces of human RGC axon pathways (Jansonius NM et al, Vis Res, 2009: 49; 2157-63). Results: Allowing for a plus or minus 5 degree error in the entry point of the ONH, 60% of our generated paths in the superior retina fell within the 95% confidence limits given by Jansonius, and 17% inferiorly. Allowing for plus or minus 10 degrees: 83% superior; and 33% inferior.
The presence of a chemical gradient was not necessary for such patterns to be produced. Generally our inferior pathways were straighter than those reported by Jansonius. A non-symmetrical thickness constraint may be required to increase the curvature of inferior axon pathways. Conclusions: Local cues are sufficient to guide RGC axons across the retina to the ONH in the absence of a chemical gradient. This model does not exclude the existence of a gradient, or partial local gradients, but simply demonstrates that a gradient is not essential to support current empirical data.
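
The thickness constraint described in the Methods above can be written as a small function. This is only an illustrative sketch, not the authors' code: the name `max_axon_stack`, the use of millimetres as the input unit, and returning `None` to mean "no constraint" are assumptions made here. Only the linear ramp from zero at the fovea to 60 axons at 3 mm, and the absence of a constraint beyond 3 mm, come from the abstract.

```python
from typing import Optional

MAX_AXONS = 60.0      # maximum stack height, reached 3 mm from the fovea (from the abstract)
RAMP_RADIUS_MM = 3.0  # beyond this radius the abstract states no constraint applies


def max_axon_stack(distance_mm: float) -> Optional[float]:
    """Hypothetical helper: allowed axon stack height at a distance from the fovea."""
    if distance_mm < 0:
        raise ValueError("distance must be non-negative")
    if distance_mm > RAMP_RADIUS_MM:
        return None  # assumption: None encodes "unconstrained"
    # Linear ramp: 0 axons at the fovea, MAX_AXONS at RAMP_RADIUS_MM.
    return MAX_AXONS * distance_mm / RAMP_RADIUS_MM
```

Under these assumptions a growth cone in bundle mode would compare the current stack height at its lattice position with `max_axon_stack(d)` and revert to search mode when the limit is reached.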

Precision Medicine: Dawn of Supercomputing in ‘omics Research, 2011.

Authors

Reumann M, Holt KE, Inouye M, Stinear T, Goudey B, Abraham G, Wang Q, Shi F, Kowalczyk A, Pearce A, Isaac A, Pope BJ, Butzkueven H, Wagner J, Moore S, Downton M, Church PC, Turner SJ, Field J, Southey M, Bowtell D, Schmidt D, Makalic E, Zobel J, Hopper J, Petrovski S, O'Brien T

Year

2011

Performance of hybrid programming models for multiscale cardiac simulations: preparing for petascale computation, IEEE transactions on bio-medical engineering, 2011.

Authors

Pope BJ, Fitch BG, Pitman MC, Rice JJ, Reumann M

Journal

IEEE transactions on bio-medical engineering

Volume

58

Issue

10

Year

2011

DOI

10.1109/TBME.2011.2161580

Pubmed ID

21768044

URL

http://dx.doi.org/10.1109/TBME.2011.2161580

Abstract

Future multiscale and multiphysics models that support research into human disease, translational medical science, and treatment can utilize the power of high-performance computing (HPC) systems. We anticipate that computationally efficient multiscale models will require the use of sophisticated hybrid programming models, mixing distributed message-passing processes [e.g., the message-passing interface (MPI)] with multithreading (e.g., OpenMP, Pthreads). The objective of this study is to compare the performance of such hybrid programming models when applied to the simulation of a realistic physiological multiscale model of the heart. Our results show that the hybrid models perform favorably when compared to an implementation using only the MPI and, furthermore, that OpenMP in combination with the MPI provides a satisfactory compromise between performance and code complexity. Having the ability to use threads within MPI processes enables the sophisticated use of all processor cores for both computation and communication phases. Considering that HPC systems in 2012 will have two orders of magnitude more cores than what was used in this study, we believe that faster than real-time multiscale cardiac simulations can be achieved on these systems.

Petascale computation performance of lightweight multiscale cardiac models using hybrid programming models, Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Conference, 2011.

Authors

Pope BJ, Fitch BG, Pitman MC, Rice JJ, Reumann M

Journal

Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Conference

Volume

2011

Year

2011

DOI

10.1109/IEMBS.2011.6090058

Pubmed ID

22254341

URL

http://dx.doi.org/10.1109/IEMBS.2011.6090058

Abstract

Future multiscale and multiphysics models must use the power of high performance computing (HPC) systems to enable research into human disease, translational medical science, and treatment. Previously we showed that computationally efficient multiscale models will require the use of sophisticated hybrid programming models, mixing distributed message passing processes (e.g. the message passing interface (MPI)) with multithreading (e.g. OpenMP, POSIX pthreads). The objective of this work is to compare the performance of such hybrid programming models when applied to the simulation of a lightweight multiscale cardiac model. Our results show that the hybrid models do not perform favourably when compared to an implementation using only MPI which is in contrast to our results using complex physiological models. Thus, with regards to lightweight multiscale cardiac models, the user may not need to increase programming complexity by using a hybrid programming approach. However, considering that model complexity will increase as well as the HPC system size in both node count and number of cores per node, it is still foreseeable that we will achieve faster than real time multiscale cardiac simulations on these systems using hybrid programming models.

A Lightweight Interactive Debugger for Haskell, Haskell '07 Proceedings of the ACM SIGPLAN workshop on Haskell, 2007.

Authors

Simon Marlow, Jose Iborra, Bernard Pope, Andy Gill

Journal

Haskell '07 Proceedings of the ACM SIGPLAN workshop on Haskell

Year

2007

DOI

10.1145/1291201.1291204

URL

http://dx.doi.org/10.1145/1291201.1291204

Abstract

This paper describes the design and construction of a Haskell source-level debugger built into the GHCi interactive environment. We have taken a pragmatic approach: the debugger is based on the traditional stop-examine-continue model of online debugging, which is simple and intuitive, but has traditionally been shunned in the context of Haskell because it exposes the lazy evaluation order. We argue that this drawback is not as severe as it may seem, and in some cases is an advantage. The design focuses on availability: our debugger is intended to work on all programs that can be compiled with GHC, and without requiring the programmer to jump through additional hoops to debug their program. The debugger has a novel approach for reconstructing the type of runtime values in a polymorphic context. Our implementation is light on complexity, and was integrated into GHC without significant upheaval.

A Declarative Debugger for Haskell, 2006.

Authors

Bernard Pope

Year

2006

URL

assets/files/BerniePope.PhD.Thesis.pdf

Abstract

This thesis is about the design and implementation of a debugging tool which helps Haskell programmers understand why their programs do not work as intended. The traditional debugging technique of examining the program execution step-by-step, popular with imperative languages, is less suitable for Haskell because its unorthodox evaluation strategy is difficult to relate to the structure of the original program source code. We build a debugger which focuses on the high-level logical meaning of a program rather than its evaluation order. This style of debugging is called declarative debugging, and it originated in logic programming languages. At the heart of the debugger is a tree which records information about the evaluation of the program in a manner which is easy to relate to the structure of the program. Links between nodes in the tree reflect logical relationships between entities in the source code. An error diagnosis algorithm is applied to the tree in a top-down fashion, searching for causes of bugs. The search is guided by an oracle, who knows how each part of the program should behave. The oracle is normally a human — typically the person who wrote the program — however, much of its behaviour can be encoded in software.
An interesting aspect of this work is that the debugger is implemented by means of a program transformation. That is, the program which is to be debugged is transformed into a new one, which when evaluated, behaves like the original program but also produces the evaluation tree as a side-effect. The transformed program is augmented with code to perform the error diagnosis on the tree. Running the transformed program constitutes the evaluation of the original program plus a debugging session. The use of program transformation allows the debugger to take advantage of existing compiler technology (a whole new compiler and runtime environment does not need to be written), which saves much work and enhances portability.
The technology described in this thesis is well-tested by an implementation in software. The result is a useful tool, called buddha, which is publicly available and supports all of the Haskell 98 standard.
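
The top-down diagnosis over an evaluation tree described in this abstract can be illustrated with a short sketch. It is written in Python rather than Haskell and is not taken from buddha: the `Node` type, the `diagnose` function and the `is_correct` oracle callback are hypothetical names introduced here, and the search shown is the simplest top-down strategy, assumed for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One function application recorded in the evaluation tree."""
    call: str                      # e.g. "insert 2 [1]"
    result: str                    # the value the application returned
    children: List["Node"] = field(default_factory=list)


def diagnose(node: Node, is_correct: Callable[[Node], bool]) -> Node:
    """Top-down search; assumes `node` has already been judged erroneous.

    Returns a node that is erroneous while every child the oracle was
    asked about is correct, which locates the bug in that application.
    """
    for child in node.children:
        if not is_correct(child):      # ask the oracle about this child
            return diagnose(child, is_correct)
    return node                        # no erroneous child: bug is here
```

For example, if the oracle judges `isort [2, 1] = [2, 1]` and its child `insert 2 [1] = [2, 1]` to be wrong but `insert 2 [] = [2]` to be right, `diagnose` returns the `insert 2 [1]` node. In practice the oracle would be a person answering questions, or software encoding part of the intended behaviour.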

Declarative Debugging with Buddha, Advanced Functional Programming. AFP 2004. Lecture Notes in Computer Science, 2004.

Authors

Bernard Pope

Journal

Advanced Functional Programming. AFP 2004. Lecture Notes in Computer Science

Volume

3622

Year

2004

URL

https://link.springer.com/chapter/10.1007/11546382_7

Abstract

Haskell is a very safe language, particularly because of its type system. However there will always be programs that do the wrong thing. Programmer fallibility, partial or incorrect specifications and typographic errors are but a few of the reasons that make bugs a fact of life. This paper is about the use and implementation of a debugger, called Buddha, which helps Haskell programmers understand why their programs misbehave. Traditional debugging tools that examine the program execution step-by-step are not suitable for Haskell because of its unorthodox evaluation strategy. Instead, a different approach is taken which abstracts away the evaluation order of the program and focuses on its high-level logical meaning.
This style of debugging is called Declarative Debugging, and it has its roots in the Logic Programming community. At the heart of the debugger is a tree which records information about the evaluation of the program in a manner which is easy to relate to the structure of the source code. It resembles a call graph annotated with the arguments and results of function applications, shown in their most evaluated form. Logical relationships between entities in the source are reflected in the links between nodes in the tree. An error diagnosis algorithm is applied to the tree in a top-down fashion in the search for causes of bugs.

Practical Aspects of Declarative Debugging in Haskell 98, Proceedings of the 5th ACM SIGPLAN international conference on Principles and practice of declarative programming, 2003.

Authors

Bernard Pope, Lee Naish

Journal

Proceedings of the 5th ACM SIGPLAN international conference on Principles and practice of declarative programming

Year

2003

DOI

10.1145/888251.888273

URL

http://dx.doi.org/10.1145/888251.888273

Abstract

Non-strict purely functional languages pose many challenges to the designers of debugging tools. Declarative debugging has long been considered a suitable candidate for the task due to its abstraction over the evaluation order of the program, although the provision of practical implementations has been lagging. In this paper we discuss the solutions used in our declarative debugger for Haskell to tackle the problems of printing values, memory usage and I/O. The debugger is based on program transformation, although much leverage is gained by interfacing with the runtime environment of the language implementation through a foreign function interface.

A program transformation for debugging Haskell 98, ACSC '03 Proceedings of the 26th Australasian computer science conference, 2003.

Authors

Bernard Pope, Lee Naish

Journal

ACSC '03 Proceedings of the 26th Australasian computer science conference

Volume

16

Year

2003

URL

https://dl.acm.org/citation.cfm?id=783132

Abstract

We present a source-to-source transformation of Haskell 98 programs for the purpose of debugging. The source code of a program is transformed into a new program which, when executed, computes the value of the original program and a high-level semantics for that computation. The semantics is given by a tree whose nodes represent function applications that were evaluated during execution. This tree is useful in situations where a high-level view of a computation is needed, such as declarative debugging. The main contribution of the paper is the treatment of higher-order functions, which have previously proven difficult to support in declarative debugging schemes.