BIOINFORMATICS 2024 Abstracts


Full Papers
Paper Nr: 19
Title:

Modeling iPSC-Derived Endothelial Cell Transition in Tumor Angiogenesis Using Petri Nets

Authors:

Adéla Šterberová, Andreea Dincu, Stijn Oudshoorn, Vincent van Duinen and Lu Cao

Abstract: Tumor angiogenesis concerns the development of new blood vessels supplying the necessary nutrients for the further development of existing tumor cells. The entire process is complex, involving the production and consumption of chemicals, endothelial cell transitions as well as cell interactions, divisions, and migrations. A microfluidic cell culture platform has been used to study angiogenesis of endothelial cells derived from human induced pluripotent stem cells (iPSC-ECs) in a physiologically relevant micro-environment. In this paper, we elaborate on how to define a pipeline for simulating the transformation and process that an iPSC-derived endothelial cell goes through in this biological scenario. We leverage the robustness and simplicity of Petri nets for modeling the cell transformation and associated constraints. The environmental and spatial factors are added using custom 2-dimensional grids. Although the pipeline does not capture the entire complexity of tumor angiogenesis, we are able to capture the essence of endothelial cell transitions in tumor angiogenesis using this conceptually simplified solution.
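
As background for readers unfamiliar with Petri nets, the following is a minimal sketch of the firing rule such a model relies on. It is not the authors' pipeline: the place names (quiescent_cell, growth_factor, activated_cell), the transition, and the token counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Transition:
    """A Petri net transition with weighted input and output arcs."""
    name: str
    inputs: dict   # place name -> tokens consumed when firing
    outputs: dict  # place name -> tokens produced when firing


def is_enabled(marking, transition):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(p, 0) >= n for p, n in transition.inputs.items())


def fire(marking, transition):
    """Return the new marking after firing; the old marking is left unchanged."""
    if not is_enabled(marking, transition):
        raise ValueError(f"transition {transition.name!r} is not enabled")
    new_marking = dict(marking)
    for place, n in transition.inputs.items():
        new_marking[place] = new_marking.get(place, 0) - n
    for place, n in transition.outputs.items():
        new_marking[place] = new_marking.get(place, 0) + n
    return new_marking


# Hypothetical example: one cell changes state by consuming two
# units of a growth factor (names and weights are assumptions).
activate = Transition(
    name="activate",
    inputs={"quiescent_cell": 1, "growth_factor": 2},
    outputs={"activated_cell": 1},
)
m0 = {"quiescent_cell": 1, "growth_factor": 3, "activated_cell": 0}
m1 = fire(m0, activate)
```

After firing, the transition is no longer enabled in m1 because no quiescent cell token remains, which is how a Petri net expresses the constraint that a cell can make a given transition only when its preconditions hold.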

Paper Nr: 47
Title:

Assembling Close Strains in Metagenome Assemblies Using Discrete Optimization

Authors:

Tam Khac Minh Truong, Roland Faure and Rumen Andonov

Abstract: Metagenomic assembly is essential for understanding microbial communities but faces challenges in distinguishing conspecific bacterial strains. This is especially true when dealing with low-accuracy sequencing reads such as PacBio CLR and Oxford Nanopore. While these technologies provide unequaled throughput and read length, the high error rate makes it difficult to distinguish close bacterial strains. Consequently, current de novo metagenome assembly methods excel at assembling dominant species but struggle to reconstruct low-abundance strains. In our study, we innovate by approaching strain separation as an Integer Linear Programming (ILP) problem. We introduce a strain-separation module, strainMiner, and integrate it into an established pipeline to create strain-separated assemblies from sequencing data. Across simulated and real experiments encompassing a wide range of error rates (5-12%), our tool consistently compared favorably to the state of the art in terms of assembly quality and strain reconstruction. Moreover, strainMiner substantially cuts down the computational burden of strain-level assembly compared to published software by leveraging the powerful Gurobi solver. We think the new methodological ideas presented in this paper will help democratize strain-separated assembly.

Paper Nr: 160
Title:

Biologically-Informed Shallow Classification Learning Integrating Pathway Knowledge

Authors:

Julius Voigt, Sascha Saralajew, Marika Kaden, Katrin Sophie Bohnsack, Lynn Reuss and Thomas Villmann

Abstract: We propose a biologically-informed shallow neural network as an alternative to the common knowledge-integrating deep neural network architecture used in bio-medical classification learning. In particular, we focus on the Generalized Matrix Learning Vector Quantization (GMLVQ) model as a robust and interpretable shallow neural classifier based on class-dependent prototype learning and accompanying matrix adaptation for suitable data mapping. To incorporate the biological knowledge, we adjust the matrix structure in GMLVQ according to the pathway knowledge for the given problem. During model training both the mapping matrix and the class prototypes are optimized. Since GMLVQ is fully interpretable by design, the interpretation of the model is straightforward, taking explicit account of pathway knowledge. Furthermore, the robustness of the model is guaranteed by the implicit separation margin optimization realized by means of the stochastic gradient descent learning. We demonstrate the performance and the interpretability of the shallow network by reconsideration of a cancer research dataset, which was already investigated using a biologically-informed deep neural network.

Paper Nr: 164
Title:

USTAR2: Fast and Succinct Representation of k-mer Sets Using De Bruijn Graphs

Authors:

Enrico Rossignolo and Matteo Comin

Abstract: A fundamental operation within the realm of computational genomics revolves around the reduction of input sequences into their constituent k-mers. The development of space-efficient methods to represent a collection of k-mers assumes significant importance in advancing the scalability of bioinformatics analyses. One prevalent strategy involves transforming the set of k-mers into a de Bruijn graph and subsequently devising a streamlined representation of this graph by identifying the smallest path cover. In this article, we introduce USTAR2, a novel algorithm for the compression of k-mers. USTAR2 harnesses the principles of node connectivity in the de Bruijn graph for a more efficient selection of paths for constructing the path cover. We performed a series of tests on the compression of real read datasets, and compared USTAR2 with several other tools. USTAR2 achieved the best performance in terms of compression, required less memory, and was also considerably faster (up to 96x). The code of USTAR2 is available at the repository https://github.com/CominLab/USTAR2.
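
To make the path-cover idea concrete, here is a minimal sketch that decomposes a sequence into k-mers, treats them as nodes of a de Bruijn graph, and covers the graph with paths spelled out as strings. It uses a plain greedy traversal, not USTAR2's connectivity-based path selection, and it ignores reverse complements; both simplifications are assumptions made for brevity.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def successors(kmer, kmer_set):
    """K-mers in the set whose prefix overlaps the (k-1)-suffix of kmer."""
    return [kmer[1:] + c for c in "ACGT" if kmer[1:] + c in kmer_set]


def predecessors(kmer, kmer_set):
    """K-mers in the set whose suffix overlaps the (k-1)-prefix of kmer."""
    return [c + kmer[:-1] for c in "ACGT" if c + kmer[:-1] in kmer_set]


def greedy_path_cover(kmer_set):
    """Cover every k-mer with exactly one path, spelled as a string."""
    unvisited = set(kmer_set)
    paths = []
    for start in sorted(kmer_set):
        if start not in unvisited:
            continue
        unvisited.remove(start)
        path = start
        # Extend the path forward while an unvisited successor exists.
        current = start
        while True:
            nxt = next((s for s in successors(current, kmer_set)
                        if s in unvisited), None)
            if nxt is None:
                break
            unvisited.remove(nxt)
            path += nxt[-1]
            current = nxt
        # Extend the path backward from the starting k-mer.
        current = start
        while True:
            prv = next((p for p in predecessors(current, kmer_set)
                        if p in unvisited), None)
            if prv is None:
                break
            unvisited.remove(prv)
            path = prv[0] + path
            current = prv
        paths.append(path)
    return paths


k = 3
kmer_set = kmers("ACGTACGGA", k)
paths = greedy_path_cover(kmer_set)
```

Each path of m k-mers is stored in k + m - 1 characters instead of k * m, so fewer and longer paths give a smaller representation. Which neighbor to follow at each step is where a tool's heuristics matter.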

Paper Nr: 172
Title:

Predictive Biomarkers in PD-1/PD-L1 Immunotherapy Response: A Machine Learning Approach Using Gene Sequencing Data

Authors:

Carolina Castaño, Isis Bonet, Joseph Pinto and Jhajaira Araujo

Abstract: Cancer, a leading cause of premature death globally, has seen a surge in new cases, projected to reach 28.4 million by 2040. Immunotherapy with immune checkpoint inhibitors (ICIs) like PD-1/PD-L1 inhibitors presents a promising treatment avenue. However, patient response rates vary, prompting the search for predictive biomarkers. Existing markers, often derived from transcriptomic analyses, exhibit moderate accuracy, hindered by cancer heterogeneity and tissue specificity. Artificial intelligence models, classified into regression, classification, and deep learning, have shown promise. Despite their potential, the limitations of current biomarkers require exploring combined predictions with multiple markers, considering various biological mechanisms. In this study, a machine learning model using RNA sequencing data from 546 patients with urothelial, renal, thymic, melanoma, non-small cell carcinoma, and oral cavity carcinoma from nine different cohorts, obtained from public databases, identified 55 genes influencing response classification. The GradientBoosting model demonstrated superior predictive performance compared to previous reports, with an AUC of 0.95, a recall of 0.84, and a specificity of 0.90. Clustering algorithms using SHapley Additive exPlanations values from the model revealed nine sample groups, each with a majority class and eight of them associated with different types of cancer, demonstrating the potential for agnostic prediction models.

Paper Nr: 203
Title:

Neural Population Decoding and Imbalanced Multi-Omic Datasets for Cancer Subtype Diagnosis

Authors:

Charles Theodore Kent, Leila Bagheriye and Johan Kwisthout

Abstract: Recent strides in the field of neural computation have seen the adoption of Winner-Take-All (WTA) circuits to facilitate the unification of hierarchical Bayesian inference and spiking neural networks (SNNs) as a neurobiologically plausible model of information processing. Current research commonly validates the performance of these networks via classification tasks, particularly of the MNIST dataset. However, researchers have not yet reached consensus about how best to translate the stochastic responses from these networks into discrete decisions, a process known as population decoding. In this work we show that population decoding, despite being an often underexamined part of SNNs, has a significant impact on the classification performance of WTA networks. For this purpose, we apply a WTA network to the problem of cancer subtype diagnosis from multi-omic data, using datasets from The Cancer Genome Atlas (TCGA). In doing so we utilise a novel implementation of gene similarity networks, a feature encoding technique based on Kohonen’s self-organising map algorithm. We further show that the impact of selecting certain population decoding methods is amplified when facing imbalanced datasets.

Paper Nr: 266
Title:

Deep Learning in Digital Breast Pathology

Authors:

Madison Rose, Joseph Geradts and Nic Herndon

Abstract: The development of scanners capable of whole slide imaging has transformed digital pathology. There have been many benefits to being able to digitize a stained glass slide from a tissue sample, but perhaps the most impactful one has been the introduction of machine learning in digital pathology. This has the potential to revolutionize the field through increased diagnostic accuracy as well as reduced workload on pathologists. In the last few years, a wide range of machine learning techniques have been applied to various tasks in digital pathology, with deep learning and convolutional neural networks being arguably the most popular choice. Breast cancer, as one of the most common cancers among women worldwide, has been a topic of wide interest since hematoxylin and eosin (H&E)-stained slides can be used for breast cancer diagnosis. This paper summarizes key advancements in digital breast pathology with a focus on whole slide image analysis and provides insight into popular methods to overcome key challenges in the industry.

Short Papers
Paper Nr: 50
Title:

Agent Simulation Using Path Telemetry for Modeling COVID-19 Workplace Hazard and Risk

Authors:

David Beymer, Vandana Mukherjee, Anup Pillai, Hakan Bulu, Vanessa Burrowes, James Kaufman and Ed Seabolt

Abstract: We present a cloud-native, agent-based simulation of disease transmission hazard and risk in a model of a particular workplace. When combined with epidemiological data for employee home counties, the simulation can be used to measure the effect of interventions and building policies on occupational hazard and risk from an infectious disease, and to compare that hazard and risk to the average risk to the employees in their home counties based on current outbreak data. We demonstrate this for two particular interventions: varying the number of employees allowed to work onsite, and enabling or disabling alternate routes at choke points such as cafeteria checkpoints. We discuss how occupational hazard and risk depend strongly on the details of workplace layout and policies and propose how the current simulation (and tools like it) can be used to evaluate policies and procedures for return to work.

Paper Nr: 53
Title:

Compositional Techniques for Asynchronous Boolean Networks

Authors:

Maram Alshahrani and Jason Steggles

Abstract: Asynchronous Boolean networks are an important qualitative modelling approach for analysing and engineering biological systems. However, their practical application is limited by the state space explosion problem and lack of engineering tools. To help address these limitations we develop new compositional techniques for constructing and analysing asynchronous Boolean networks based on the idea of merging entities using Boolean operators. We propose a novel asynchronous interference state graph to model the interference that occurs in a composition and develop a range of important new asynchronous compositional techniques for analysing behavioural preservation and identifying point attractors.
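
For context, the sketch below shows what an asynchronous Boolean network, its state graph, and its point attractors look like in code. It covers only these basic notions, not the compositional techniques or the interference state graph proposed in the paper, and the two-entity mutual-inhibition network is a hypothetical example.

```python
from itertools import product


def async_successors(state, functions):
    """States reachable by updating exactly one entity whose value changes."""
    succs = set()
    for i, f in enumerate(functions):
        new_value = f(state)
        if new_value != state[i]:
            succs.add(state[:i] + (new_value,) + state[i + 1:])
    return succs


def state_graph(functions):
    """Map every global state to its set of asynchronous successors."""
    n = len(functions)
    return {s: async_successors(s, functions)
            for s in product((0, 1), repeat=n)}


def point_attractors(graph):
    """Point attractors are states with no outgoing transition."""
    return sorted(s for s, succs in graph.items() if not succs)


# Hypothetical network: two entities that inhibit each other.
functions = [
    lambda s: int(not s[1]),  # next value of entity 0
    lambda s: int(not s[0]),  # next value of entity 1
]
graph = state_graph(functions)
attractors = point_attractors(graph)
```

The state graph has 2^n states for n entities, which is the state space explosion the abstract refers to and the reason compositional analysis is attractive.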

Paper Nr: 85
Title:

Modeling Intestinal Glucose Absorption from D-Xylose Data

Authors:

Danilo Dursoniah, Maxime Folschette, Rebecca Goutchtat, Violeta Raverdy, François Pattou and Cédric Lhoussaine

Abstract: Type 2 Diabetes (T2D) is one of the main epidemics of this century. One of the hypotheses of medical research is that an important cause of T2D may be the abnormal regulation of intestinal glucose absorption (IGA). Early detection of IGA disorders, and, more generally, precision medicine, may help to prevent the risk of T2D. This could be achieved by predictive models of glucose dynamics in blood following an oral ingestion. Even though many such models have been proposed, they either do not cope with IGA at all, or their calibration requires the use of complex and invasive tracer protocols that make them clinically unusable on a daily basis. To overcome this issue, D-xylose may be used as an IGA marker. Indeed, it is a glucose analogue with similar intestinal absorption mechanisms but, contrary to glucose, its dynamics in blood only result from gastric emptying, intestinal absorption and elimination by the kidney. In this paper, we investigate a model-based assessment of IGA based on D-xylose dynamics in blood after oral absorption. We show that a multi-compartment model of intestinal absorption can fit D-xylose data obtained from different experimental conditions very well and be a good qualitative estimate of IGA. Additionally, because gastric emptying is a possible confounding factor with intestinal absorption, we explore the relative contribution of both mechanisms to the rate of D-xylose (and thus glucose) appearance in blood.

Paper Nr: 128
Title:

Computational Modeling of Arterial Walls: Evaluating Model Complexity and the Influence of Model Parameters on Deformation Outcomes

Authors:

Seda Aslan, Xiaolong Liu, Enze Chen, Miya Mese-Jones, Bryan Gonzalez, Ryan O’Hara, Yue-Hin Loke, Narutoshi Hibino, Laura Olivieri, Axel Krieger and Thao D. Nguyen

Abstract: Computational models have been instrumental in advancing cardiovascular applications, particularly in simulating arterial behaviors for pre-surgical treatment strategies. Nonetheless, uncertainties arising from patient-specific parameters, such as arterial wall thickness and material properties, pose challenges to their precision. This study utilized finite element analysis to simulate the deformation response of the porcine pulmonary artery to a pressure change and performed a sensitivity analysis of the effects of material properties and vessel wall thickness on the deformation. The widely recognized Holzapfel-Gasser-Ogden (HGO) model was used to describe the stress-strain behavior of the arterial wall. Initially, the arterial walls were modeled as a single layer, then as separate adventitia and intima-media layers with constant thickness. The model complexity was increased by varying thickness and specific material properties of different regions in pulmonary arteries, based on ex vivo data from existing literature. For the sensitivity analysis, the HGO model parameters were adjusted within their measured variance to study their impact on deformation. The results showed that a single layer with regionally varying wall thickness is needed to reproduce the strain response measured in vivo over the cardiac cycle. The strain response was also most sensitive to variations in the thickness and isotropic shear modulus of the vessel wall. Using this knowledge, we tuned the model parameters for three porcine models until the deformation results were within 10% of the MRI-measured deformations. This study offers valuable insights to identify key model features for specimen-specific computational modeling of the pulmonary artery, thus providing a foundation for enhancing the realism of soft tissue deformation simulations.

Paper Nr: 153
Title:

Identification of Bistability in Enzymatic Reaction Networks Using Hysteresis Response

Authors:

Takashi Naka

Abstract: Intracellular signaling systems can be viewed as enzymatic reaction networks in which enzymes regulate each other through activation and inactivation, and exhibit various properties such as bistability depending on their regulatory structure and parameter values. In this study, we formulate the intracellular signaling systems as regulatory networks whose nodes are cyclic reaction systems of enzyme activation and inactivation, and propose an evaluation function that can identify bistability with low computational cost. For the purpose of demonstrating its effectiveness, we identified 4- and 5-node regulatory networks that exhibit bistability. Furthermore, the effect of parameter values on bistability was analyzed, suggesting that the regulatory structure is more dominant than parameter values for the emergence of bistability.

Paper Nr: 183
Title:

Evaluating the Performance of Protein Structure Prediction in Detecting Structural Changes of Pathogenic Nonsynonymous Single Nucleotide Variants

Authors:

Hong-Sheng Lai and Chien-Yu Chen

Abstract: Protein structure prediction serves as an efficient tool, saving time and circumventing the need for laborious experimental endeavors. Distinguished methodologies, including AlphaFold, RoseTTAFold, and ESMFold, have proven their precision through rigorous evaluation in the last Critical Assessment of Protein Structure Prediction (CASP14). The success of protein structure prediction raises the following question: Can the prediction tools discern structural alterations resulting from single amino acid changes? In this regard, the objective of this study is to assess the performance of existing structure prediction tools on mutated sequences. In this study, we posited that a specific fraction of the pathogenic nonsynonymous single nucleotide variants (nsSNVs) would experience structural alterations following amino acid mutations. We meticulously assembled an extensive dataset by initially sourcing data from ClinVar and subsequently applying multiple filters, resulting in 2,371 pathogenic nsSNVs. Utilizing UniProt, we acquired reference sequences and generated the corresponding alternative sequences based on variant information. This study applied three structure prediction tools to both the reference and alternative sequences, expecting some structural changes upon mutation. Our findings affirm AlphaFold as the foremost prediction tool presently; nonetheless, our experimental results underscore persistent challenges in accurately predicting structural alterations induced by nonsynonymous SNVs. Discrepancies in predicted structures, when observed, often stem from a lack of confidence in the predictions or the spatial separation between compact domains interrupted by disordered regions, posing challenges to successful alignment. The findings from this study highlight the ongoing challenges in accurately predicting the structure of mutated sequences. To enhance the refinement of prediction models, there is a clear need for additional experimentally determined structures of proteins with nsSNVs in the future.

Paper Nr: 219
Title:

Generation of H&E-Stained Histopathological Images Conditioned on Ki67 Index Using StyleGAN Model

Authors:

Lucia Piatriková, Ivan Cimrák and Dominika Petríková

Abstract: The analysis of tissue staining is a crucial aspect of cancer diagnosis. Hematoxylin and Eosin (H&E) staining captures fundamental morphological structures, while analysing Ki67-stained images provides deeper information about the tissue. However, this method is more expensive and time-consuming. Integrating machine learning techniques into pathologists’ workflow can save time and resources and provide reproducible results without intra- and inter-observer variability. However, the model must be explainable to be applicable in clinical practice. A generative model can add supplementary information that serves as an explanation for model predictions. This paper demonstrates the preliminary results of the conditional StyleGAN model trained on H&E-stained images conditioned on the corresponding Ki67 indexes. In our future research, StyleGAN will be part of a model for the estimation of Ki67 index from H&E staining and will generate explanations for the model’s predictions.

Paper Nr: 236
Title:

Deep Learning in Breast Calcifications Classification: Analysis of Cross-Database Knowledge Transferability

Authors:

Adam Mračko, Ivan Cimrák, Lucia Vanovčanová and Viera Lehotská

Abstract: This study delves into the application of deep learning models for the classification of breast calcifications in mammography images. The initial objective was to investigate various convolutional neural network (CNN) architectures and their influence on model accuracy. ResNet101 emerged as the most effective architecture, although other models exhibited comparable performances. The insights gained were subsequently applied to the main goal, which focused on examining the transferability of knowledge between models trained on digitalized films (Curated Breast Imaging Subset of Digital Database for Screening Mammography) and those trained on digital mammography images (Optimam Database). Results confirmed the lack of seamless transferability, prompting the creation of a combined dataset for training, significantly improving overall model accuracy to 76.2%. The study also scrutinized instances of incorrect predictions across different models, particularly those posing challenges even for medical professionals. Visualizations using Grad-CAM aided in understanding the models’ decision-making process.

Paper Nr: 237
Title:

Ki67 Expression Classification from HE Images with Semi-Automated Computer-Generated Annotations

Authors:

Dominika Petríková, Ivan Cimrák, Katarína Tobiášová and Lukáš Plank

Abstract: The Ki67 protein plays a crucial role in cell proliferation and is considered a good marker for determining cell growth. In histopathology, it is often assessed by immunohistochemistry (IHC) staining. Even though IHC is considered common practice in clinical diagnosis, it has several limitations, such as variability and subjectivity: interpretation of IHC can be subjective and vary between individuals. Moreover, quantification can be challenging, as well as costly and time-consuming. Neural network models therefore hold promise for improving this area; however, they require a large, high-quality annotated dataset, whose creation is time-consuming and laborious work for experts. In this paper, we employ the proposed semi-automated approach for generating the Ki67 score from pairs of hematoxylin and eosin (HE) and IHC slides, which aims to minimize expert assistance. The approach consists of image analysis methods such as clustering optimization for tissue registration. Using a sample of 84 pairs of whole slide images of seminoma tissue stained by HE and IHC, we generated a dataset containing approximately 30 thousand labeled patches. On the HE patches annotated by the proposed approach, we ran several experiments on fine-tuning neural network models to predict the Ki67 score from HE images.

Paper Nr: 240
Title:

SMT: A High-Performance Approach for Counting Kmers

Authors:

Jader M. C. Garbelini, Danilo Sipoli Sanches, André Yoshiaki Kashiwabara and Aurora T. R. Pozo

Abstract: Motivation: Finding conserved motifs in DNA sequences is a key problem in bioinformatics. The growing availability of large-scale genomic data poses significant challenges for computational biology, particularly in terms of efficiency in analysis, kmer identification, and noise presence. The detection of conserved motifs and patterns in DNA sequences is crucial for understanding gene functions and regulations. Therefore, it is essential to develop novel approaches and methods that can handle these large volumes of information and provide accurate and fast results. Results: We present SMT, an innovative tool designed to efficiently store and count kmers, optimizing memory usage and computation time. The application of SMT has also proven effective in discovering motifs in ChIP-seq data, allowing the identification of conserved regions in sequences. Furthermore, SMT allows exact searches in constant time proportional to the size of k and retrieves the most abundant kmers through a frequency table. This approach facilitates large-scale data analysis and provides important insights into the conserved properties of biological sequences. The application of SMT in motif discovery demonstrates its potential to drive research in bioinformatics and genomics. Availability and implementation: Supplementary data and results are available to provide additional information and support the conclusions. SMT and source code can be found at the following address: https://github.com/jadermcg/smt.
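
As a point of reference for the task SMT addresses, the sketch below counts kmers with a plain hash-based frequency table from the Python standard library. This is a baseline illustration only; it says nothing about SMT's own data structure, and the function names and example sequence are hypothetical.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every length-k substring of the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def most_abundant(counts, n):
    """Return the n most frequent kmers with their counts."""
    return counts.most_common(n)


counts = count_kmers("ACGTACGT", 4)
top = most_abundant(counts, 1)
```

Here the kmer ACGT occurs twice and every other 4-mer once. An exact lookup such as counts["ACGT"] hashes k characters, so its cost grows with k rather than with the number of stored kmers.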

Paper Nr: 241
Title:

Particle and Cell Cluster Separation Based on Inertial Effects in Rectangular Serpentine Channels

Authors:

Michal Mulík and Ivan Cimrák

Abstract: It is well-established that the inertial effect in microfluidics has a significant impact on particle and cell cluster separation. The outcomes are particularly dependent on the channel geometry and the particle and cell suspensions introduced into the channel. In this study, we investigate various combinations related to the size of a curved channel, fluid velocity, and the size and elasticity of clusters. We quantitatively and qualitatively evaluate the behavior of the examined clusters with respect to separation potential. Computational results suggest specific combinations of flow parameters leading to efficient particle and cell cluster separation.

Paper Nr: 246
Title:

Fine-Tuning of Conditional Transformers Improves the Generation of Functionally Characterized Proteins

Authors:

Marco Nicolini, Dario Malchiodi, Alberto Cabri, Emanuele Cavalleri, Marco Mesiti, Alberto Paccanaro, Peter N. Robinson, Justin Reese, Elena Casiraghi and Giorgio Valentini

Abstract: Conditional transformers improve the generative capabilities of large language models (LLMs) by processing specific control tags able to drive the generation of texts characterized by specific features. Recently, a similar approach has been applied to the generation of functionally characterized proteins by adding specific tags to the protein sequence to qualify their functions (e.g., Gene Ontology terms) or other characteristics (e.g., their family or the species to which they belong). In this work, we show that fine-tuning conditional transformers, pre-trained on large corpora of proteins, on specific protein families can significantly enhance the prediction accuracy of the pre-trained models and can also generate new potentially functional proteins that could enlarge the protein space explored by natural evolution. We obtained encouraging results on the phage lysozyme family of proteins, achieving statistically significantly better prediction results than the original pre-trained model. The comparative analysis of the primary and tertiary structure of the synthetic proteins generated by our model with the natural ones shows that the resulting fine-tuned model is able to generate biologically plausible proteins. Our results confirm and suggest that fine-tuned conditional transformers can be applied to other functionally characterized proteins for possible industrial and pharmacological applications.

Paper Nr: 113
Title:

ReScore Disease Groups Based on Multiple Machine Learnings Utilizing the Grouping-Scoring-Modeling Approach

Authors:

Emma Qumsiyeh, Miar Yousef and Malik Yousef

Abstract: The integration of biological prior knowledge for disease gene associations has shown significant promise in discovering new biomarkers with potential translational applications. GediNET is a recent tool that is considered an integrative approach. In this research paper, we aim to enhance the functionality of GediNET by incorporating ten different machine learning algorithms. A critical element of this study involves utilizing the Robust Rank Aggregation method to aggregate all the ranked lists over the cross-validations, suggesting the final ranked significant list of disease groups. The Robust Rank Aggregation is used to re-score disease groups based on multiple machine learning algorithms. Moreover, a comprehensive comparative analysis of these ten machine learning algorithms has revealed insights regarding their intrinsic qualities. This helps researchers determine which algorithm is most effective in the context of disease grouping and classification.

Paper Nr: 136
Title:

Visual Insights in Human Cancer Mutational Patterns: Similarity-Based Cancer Classification Using Siamese Networks

Authors:

Rocco Zaccagnino, Clelia De Felice, Marco Russo and Rosalba Zizza

Abstract: In recent years, a number of innovations concerning the diagnosis and treatment of diseases through the application of genomics have opened the door to the detailed analysis of somatic mutation patterns in human cancers. Several AI-based systems have been proposed to identify correlations between mutations and type of cancer. However, the use of AI in Bioinformatics still presents two main limitations: (i) the explainability, i.e., the ability of the methods to partially explain and motivate their behavior, and (ii) the usability, i.e., the strong limitations that are found in the actual use of such methods in real bio-medical contexts and scenarios. In this work, we propose a novel ML-based cancer-type detection system which integrates explainability and usability techniques. To this aim, we first formulate the cancer-type detection problem using the similarity-based classification paradigm. Then, given a cancer sample, we assume that a set of somatic mutation features is available which can be interpreted as a cancer mutational view of the sample itself. Finally, we propose the use of a special Machine Learning model defined for learning similarity functions, namely the Siamese Neural Network (SNN). The proposed SNN learns to take a pair of cancer mutational views as input, and to compute a similarity score that can be used to verify whether such samples are similar or not. Preliminary experiments carried out to assess the effectiveness of the proposed system show high performance, reaching an F1 score of 97.61%, and highlight how the similarity-based classification paradigm could be more suitable than the commonly used classification paradigm for the formulation of the cancer-type detection problem.

Paper Nr: 140
Title:

Detecting Retinal Fundus Image Synthesis by Means of Generative Adversarial Network

Authors:

Francesco Mercaldo, Luca Brunese, Mario Cesarelli, Fabio Martinelli and Antonella Santone

Abstract: The recent introduction of Generative Adversarial Networks has showcased impressive capabilities in producing images that closely resemble genuine ones. As a consequence, concerns have arisen within both the academic and industrial communities regarding the difficulty of distinguishing between counterfeit and authentic images. This matter carries significant importance since images play a crucial role in various fields, such as biomedical image recognition and bioimaging classification. In this paper, we propose a method to discriminate retinal fundus images generated by a Generative Adversarial Network. Following the generation of the bioimages, we employ machine learning to understand whether it is possible to differentiate between real and synthetic retinal fundus images. We consider a Deep Convolutional Generative Adversarial Network, a specific type of Generative Adversarial Network, for retinal fundus image generation. The experimental analysis reveals that even though the generated images are visually indistinguishable from genuine ones, an F-Measure equal to 0.97 is obtained in the discrimination between real and synthetic images. Nevertheless, this indicates that several synthetic retinal fundus images are not recognized as such and are thus considered authentic retinal fundus images.

Paper Nr: 178
Title:

Semantic Textual Similarity Assessment in Chest X-ray Reports Using a Domain-Specific Cosine-Based Metric

Authors:

Sayeh Gholipour Picha, Dawood Al Chanti and Alice Caplier

Abstract: Medical language processing and deep learning techniques have emerged as critical tools for improving healthcare, particularly in the analysis of medical imaging and medical text data. These multimodal data fusion techniques help to improve the interpretation of medical imaging and lead to increased diagnostic accuracy, informed clinical decisions, and improved patient outcomes. The success of these models relies on the ability to extract and consolidate semantic information from clinical text. This paper addresses the need for more robust methods to evaluate the semantic content of medical reports. Conventional natural language processing approaches and metrics were originally designed for considering the semantic context in the natural language domain and machine translation, and often fail to capture the complex semantic meanings inherent in medical content. In this study, we introduce a novel approach designed specifically for assessing the semantic similarity between generated medical reports and the ground truth. Our approach is validated, demonstrating its efficiency in assessing domain-specific semantic similarity within medical contexts. By applying our metric to state-of-the-art Chest X-ray report generation models, we obtain results that not only align with conventional metrics but also provide more contextually meaningful scores in the considered medical domain.
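
The paper's title describes the metric as cosine-based, so a minimal sketch of cosine similarity between two text representations may help. The bag-of-words representation used here is a stand-in assumption: the abstract does not say how reports are represented, and a domain-specific metric would presumably use a richer, medically informed representation. The example report strings are invented.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Hypothetical stand-in representation: lowercase word counts."""
    return Counter(text.lower().split())


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for empty text
    return dot / (norm_u * norm_v)


reference = bag_of_words("no acute cardiopulmonary abnormality")
generated = bag_of_words("no acute abnormality")
unrelated = bag_of_words("large pleural effusion")

score_close = cosine_similarity(reference, generated)
score_far = cosine_similarity(reference, unrelated)
```

Word overlap alone cannot tell that two differently worded findings mean the same thing, or that a negation reverses a finding, which illustrates why general-purpose representations can mislead on clinical text.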

Paper Nr: 206
Title:

The Interactive Network Visualization of the Interactions Between Topologically Associating Domains in the Genome of Fruit Fly

Authors:

Samira Mali and Swetha Annavarapu

Abstract: In this work, we created a network visualization that helps to understand how Topologically Associating Domains (TADs) interact with each other across the genome based on where each TAD is located, that is, whether it is near the center or near the edge of the nucleus. This visualization can reveal how the dense regions and sparse regions of chromosome interactions are distributed in one view. The pilot study demonstrates how network visualization of TAD-TAD interactions can quickly answer numerous major questions in the 3D genome and epigenetics fields without requiring the development of machine learning methods or algorithms to unlock heatmap structures. The questions include, but are not limited to, determining many-way interactions and interactions between TADs belonging to various epigenetic classes.

Paper Nr: 222
Title:

Formal Analysis of Uncertain Continuous Markov Chains in Systems Biology

Authors:

Krishnendu Ghosh and Caroline Goodman

Abstract: Data-dependent abstraction for continuous-time Markov chains is often challenging given the incompleteness and imprecision of data. Uncertainty in the environment is modeled in the form of an uncertain continuous-time Markov chain. In this work, a tractable model checking methodology, stochastic partial model set checking, is formalized by approximation of the uncertain continuous-time Markov chain. The methodology was applied to query and draw inferences on a phylogenetic tree constructed under uncertainty. Queries were posed on the formalism using continuous stochastic logic formulas. Experimental results demonstrate the computational feasibility of the model.