Publications

Background/Objectives: Massively parallel sequencing technologies have advanced chronic lymphocytic leukemia (CLL) diagnostics and precision oncology. Illumina platforms, while offering robust performance, require substantial infrastructure investment and a large number of samples for cost-efficiency. Conversely, third-generation long-read nanopore sequencing from Oxford Nanopore Technologies (ONT) can significantly reduce sequencing costs, making it a valuable tool in resource-limited settings. However, nanopore sequencing has lower accuracy and throughput than Illumina platforms, necessitating additional computational strategies. In this paper, we demonstrate that integrating publicly available short-read data with in-house generated ONT data, along with the application of machine learning approaches, enables the characterization of the CLL transcriptome landscape, the identification of clinically relevant molecular subtypes, and the assignment of these subtypes to nanopore-sequenced samples.
Methods: Public Illumina RNA sequencing data for 608 CLL samples were obtained from the CLL-Map Portal. CLL transcriptome analysis, gene module identification, and transcriptomic subtype classification were performed using the oposSOM R package for high-dimensional data visualization with self-organizing maps (SOM). Eight CLL patients were recruited from the Hematology Center After Prof. R. Yeolyan (Yerevan, Armenia). Sequencing libraries were prepared from blood total RNA using the PCR-cDNA sequencing-barcoding kit (SQK-PCB109) following the manufacturer's protocol and sequenced on an R9.4.1 flow cell for 24-48 h. Raw reads were converted to TPM values. These data were projected into the SOM space using the supervised SOM portrayal (supSOM) approach, which uses support vector machine regression to predict the SOM portrait of new samples.
Results: The CLL transcriptomic landscape reveals disruptions in gene modules (spots) associated with T cell cytotoxicity, B and T cell activation, inflammation, cell cycle, DNA repair, proliferation, and splicing. A specific gene module contained genes associated with poor prognosis in CLL. Accordingly, CLL samples were classified into T-cell cytotoxic, immune, proliferative, splicing, and three mixed types: proliferative-immune, proliferative-splicing, and proliferative-immune-splicing. These transcriptomic subtypes were associated with survival in a manner orthogonal to gender and mutation status. Using supervised machine learning approaches, transcriptomic subtypes were assigned to patient samples sequenced with nanopore sequencing.
Conclusions: This study demonstrates that the CLL transcriptome landscape can be parsed into functional modules, revealing distinct molecular subtypes based on proliferative and immune activity, with important implications for prognosis and treatment that are orthogonal to other molecular classifications. Additionally, the integration of nanopore sequencing with public datasets and machine learning offers a cost-effective approach to molecular subtyping and prognostic prediction, facilitating more accessible and personalized CLL care.
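
The Methods above mention that raw reads were converted to TPM (transcripts per million) values. The abstract does not say which tool or normalization was used, so the following is only a minimal sketch of the standard TPM definition. The function name `tpm` and its arguments are hypothetical, and some long-read workflows skip the length normalization step shown here.

```python
def tpm(counts, lengths_kb):
    """Standard TPM: length-normalize counts, then scale so values sum to 1e6.

    counts     -- dict mapping gene ID to raw read count (hypothetical input)
    lengths_kb -- dict mapping gene ID to transcript length in kilobases
    """
    # Reads per kilobase for each gene
    rates = {gene: counts[gene] / lengths_kb[gene] for gene in counts}
    total = sum(rates.values())
    if total == 0:
        # No reads at all: every gene gets zero rather than dividing by zero
        return {gene: 0.0 for gene in counts}
    # Rescale so that the values of one sample sum to one million
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}
```

Because each sample is rescaled to the same total, TPM values are comparable across samples with different sequencing depths, which is what a projection of new samples into an existing expression landscape needs.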

Authors: A. Arakelyan, T. Sirunyan, G. Khachatryan, S. Hakobyan, A. Minasyan, M. Nikoghosyan, M. Hakobyan, A. Chavushyan, G. Martirosyan, Y. Hakobyan, H. Binder

Date Published: 13th Mar 2025

Publication Type: Journal

Telomeres, protective caps at chromosome ends, maintain genomic stability and control cell lifespan. Dysregulated telomere maintenance mechanisms (TMMs) are cancer hallmarks, enabling unchecked cell proliferation. We conducted a pan-cancer evaluation of TMM using RNA sequencing data from The Cancer Genome Atlas for 33 different cancer types and analyzed the activities of telomerase-dependent (TEL) and alternative lengthening of telomeres (ALT) TMM pathways in detail. To further characterize the TMM profiles, we categorized the tumors based on their ALT and TEL TMM pathway activities into five major phenotypes: ALT (high) TEL (low), ALT (low) TEL (low), ALT (middle) TEL (middle), ALT (high) TEL (high), and ALT (low) TEL (high). These phenotypes refer to variations in telomere maintenance strategies, shedding light on the heterogeneous nature of telomere regulation in cancer. Moreover, we investigated the clinical implications of TMM phenotypes by examining their associations with clinical characteristics and patient outcomes. Specific TMM profiles were linked to specific survival patterns, emphasizing the potential of TMM profiling as a prognostic indicator and aiding in personalized cancer treatment strategies. Gene ontology analysis of the TMM phenotypes unveiled enriched biological processes associated with cell cycle regulation (both TEL and ALT), DNA replication (TEL), and chromosome dynamics (ALT) showing that telomere maintenance is tightly intertwined with cellular processes governing proliferation and genomic stability. Overall, our study provides an overview of the complexity of transcriptional regulation of telomere maintenance mechanisms in cancer.

Authors: M. Hakobyan, H. Binder, A. Arakelyan

Date Published: 2nd Jul 2024

Publication Type: Journal

Most high-throughput genomic data analysis pipelines currently rely on over-representation or gene set enrichment analysis (ORA/GSEA) approaches for functional analysis. In contrast, topology-based pathway analysis methods, which offer a more biologically informed perspective by incorporating interaction and topology information, have remained underutilized and inaccessible due to various limiting factors. These methods rely heavily on the quality of pathway topologies and often use predefined topologies from databases without assessing their correctness. To address these issues and make topology-aware pathway analysis more accessible and flexible, we introduce the PSF (Pathway Signal Flow) toolkit R package. Our toolkit integrates pathway curation and topology-based analysis, providing interactive and command-line tools that facilitate pathway importation, correction, and modification from diverse sources. This enables users to perform topology-based pathway signal flow analysis in both interactive and command-line modes. To showcase the toolkit's usability, we curated 36 KEGG signaling pathways and conducted several use-case studies, comparing our method with ORA and the topology-based signaling pathway impact analysis (SPIA) method. The results demonstrate that the algorithm can effectively identify ORA-enriched pathways while providing more detailed branch-level information. Moreover, in contrast to the SPIA method, it offers the advantage of being cut-off free and less susceptible to the variability caused by selection thresholds. By combining pathway curation and topology-based analysis, the PSF toolkit enhances the quality, flexibility, and accessibility of topology-aware pathway analysis. Researchers can now easily import pathways from various sources, correct and modify them as needed, and perform detailed topology-based pathway signal flow analysis.
In summary, our PSF toolkit offers an integrated solution that addresses the limitations of current topology-based pathway analysis methods. By providing interactive and command-line tools for pathway curation and topology-based analysis, we empower researchers to conduct comprehensive pathway analyses across a wide range of applications.
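
For context on the ORA baseline that the abstract compares against, over-representation analysis is commonly implemented as a one-sided hypergeometric test on a pre-selected gene list. The sketch below is not taken from the PSF toolkit; the function name `ora_p_value` and its parameters are hypothetical and serve only to illustrate the test.

```python
from math import comb


def ora_p_value(universe_size, pathway_size, hits_size, overlap):
    """One-sided hypergeometric p-value: P(X >= overlap).

    universe_size -- number of genes measured in total
    pathway_size  -- number of those genes annotated to the pathway
    hits_size     -- number of genes selected as "significant"
    overlap       -- number of selected genes that fall in the pathway
    """
    max_overlap = min(pathway_size, hits_size)
    total = comb(universe_size, hits_size)
    # Sum the probability mass for every overlap at least as large as observed
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, hits_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total
```

Note that `hits_size` depends on the cut-off used to call genes significant, so the p-value changes with that threshold. The test also ignores how the pathway genes are connected, which is the information topology-based methods add.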

Authors: S. Hakobyan, A. Stepanyan, L. Nersisyan, H. Binder, A. Arakelyan

Date Published: 8th Sep 2023

Publication Type: Journal

The molecular mechanisms of the liver metastasis of colorectal cancer (CRLM) remain poorly understood. Here, we applied machine learning and bioinformatics trajectory inference to analyze a gene expression dataset of CRLM. We studied the co-regulation patterns at the gene level, the potential paths of tumor development, their functional context, and their prognostic relevance. Our analysis confirmed the classification of CRLM into five liver metastasis subtypes (LMS). We provide gene-marker signatures for each LMS, and a comprehensive functional characterization that considers both the hallmarks of cancer and the tumor microenvironment. The ordering of CRLMs along a pseudotime-tree revealed a continuous shift in expression programs, suggesting a developmental relationship between the subtypes. Notably, trajectory inference and personalized analysis discovered a range of epigenetic states that shape and guide metastasis progression. By constructing prognostic maps that divided the expression landscape into regions associated with favorable and unfavorable prognoses, we derived a prognostic expression score. This score was associated with critical processes such as epithelial-mesenchymal transition, treatment resistance, and immune evasion. These factors were associated with responses to neoadjuvant treatment and the formation of an immuno-suppressive, mesenchymal state. Our machine learning-based molecular profiling provides an in-depth characterization of CRLM heterogeneity with possible implications for treatment and personalized diagnostics.

Authors: O. Ashekyan, N. Shahbazyan, Y. Bareghamyan, A. Kudryavzeva, D. Mandel, M. Schmidt, H. Loeffler-Wirth, M. Uduman, D. Chand, D. Underwood, G. Armen, A. Arakelyan, L. Nersisyan, H. Binder

Date Published: 28th Jul 2023

Publication Type: Journal

Multi-omics high-throughput technologies produce data sets that comprise multiple omics modalities rather than a single one, often from patient-matched tumour specimens. The integrative analysis of these omics modalities is essential to obtain a holistic view of the otherwise fragmented information hidden in these data. We present an intuitive method enabling the combined analysis of multi-omics data based on self-organizing maps machine learning. It "portrays" the expression, methylation and copy number variation (CNV) landscapes of each tumour using the same gene-centred coordinate system. It enables the visual evaluation and direct comparison of the different omics layers on a personalized basis. We applied this combined molecular portrayal to lower grade gliomas, a heterogeneous brain tumour entity. It classifies into a series of molecular subtypes defined by genetic key lesions, which associate with large-scale effects on DNA methylation and gene expression and ultimately drive cell fate decisions towards oligodendroglioma-, astrocytoma- and glioblastoma-like cancer cell lineages with different prognoses. Consensus modes of concerted changes of expression, methylation and CNV are governed by the degree of co-regulation within and between the omics layers. The similarity landscapes reflect partly independent effects of genetic lesions and DNA methylation with consequences for cancer hallmark characteristics such as proliferation, inflammation and blocked differentiation in a subtype-specific fashion. The method is not restricted to the triple-omics data used here. It can be extended to integrate other omics features such as genetic mutation and protein expression data, as well as to extract prognostic markers.

Authors: H. Binder, M. Schmidt, L. Hopp, S. Davitavyan, A. Arakelyan, H. Loeffler-Wirth

Date Published: 4th Jun 2022

Publication Type: Journal

Self-organizing maps (SOM) portrayal has been proven to be a powerful approach for analysis of transcriptomic, genomic, epigenetic, single-cell, and pathway-level data as well as for “multi-omic” integrative analyses. However, the SOM method has a major disadvantage: it requires retraining on the entire dataset once a new sample is added, which can be resource- and time-demanding. Retraining also shifts the gene landscape, thus complicating the interpretation and comparison of results. To overcome this issue, we have developed two transfer learning approaches that allow for extending the SOM space with new samples while preserving its intrinsic structure. The extension SOM (exSOM) approach is based on adding secondary data to the existing SOM space by “meta-gene adaptation”, while supervised SOM portrayal (supSOM) adds a support vector machine regression model on top of the original SOM algorithm to “predict” the portrait of a new sample. Both methods have been shown to accurately combine existing and new data. With simulated data, exSOM outperforms supSOM in accuracy, while supSOM significantly reduces the computing time and outperforms exSOM in this respect. Analysis of real datasets demonstrated the validity of the projection methods with independent datasets mapped onto the existing SOM space. Moreover, both methods handle well the projection of samples with new characteristics that were not present in the training datasets.

Authors: Maria Nikoghosyan, Henry Loeffler-Wirth, Suren Davidavyan, Hans Binder, Arsen Arakelyan

Date Published: 27th Dec 2021

Publication Type: Journal
