IdMotif: An Interactive Motif Identification in Protein Sequences
 Ji Hwan Park
 Vikash Prasad
 Sydney Newsom
 Fares Najar
 Rakhi Rajan

 DOI: 10.1109/MCG.2023.3345742
Room: Hall E1
Keywords
Proteins, Predictive models, Biological system modeling, Protein sequence, Amino acids, Transformers, Computational modeling, Visual analytics, Interactive systems
Abstract
This article presents idMotif, a visual analytics framework that supports domain experts in identifying motifs in protein sequences. A motif is a short sequence of amino acids usually associated with a distinct function of a protein, and identifying similar motifs across protein sequences helps predict certain types of disease or infection. idMotif can be used to explore, analyze, and visualize such motifs. We introduce a deep-learning-based method for grouping protein sequences, and we let users discover motif candidates for protein groups based on local explanations of the deep-learning model's decisions. idMotif provides several interactive linked views for between- and within-cluster/group analysis and for sequence analysis. Through a case study and experts' feedback, we demonstrate how the framework helps domain experts analyze protein sequences and identify motifs.
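
To make the idea of deriving motif candidates from local explanations concrete, the following is a minimal sketch and not the method used in idMotif. It assumes that some explanation technique has already assigned one importance score to each residue of a sequence; the function name `motif_candidates`, the `threshold` and `min_len` parameters, and the example scores are all hypothetical. The sketch reports each contiguous run of high-scoring residues as a candidate.

```python
from typing import List, Tuple


def motif_candidates(
    sequence: str,
    scores: List[float],
    threshold: float = 0.5,  # hypothetical cutoff for an "important" residue
    min_len: int = 3,        # hypothetical minimum motif length
) -> List[Tuple[int, str]]:
    """Return (start_index, subsequence) for each contiguous run of residues
    whose importance score is at least `threshold` and whose length is at
    least `min_len`. Scores are assumed to come from a local explanation of
    a model's decision; how they are computed is outside this sketch."""
    if len(sequence) != len(scores):
        raise ValueError("sequence and scores must have the same length")

    candidates: List[Tuple[int, str]] = []
    start = None  # start index of the current high-scoring run, if any
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                candidates.append((start, sequence[start:i]))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(sequence) - start >= min_len:
        candidates.append((start, sequence[start:]))
    return candidates


# Illustrative input only; these scores are made up, not model output.
example_sequence = "MKTAYIAKQR"
example_scores = [0.1, 0.2, 0.8, 0.9, 0.7, 0.1, 0.6, 0.6, 0.9, 0.95]
example_result = motif_candidates(example_sequence, example_scores)
```

In an interactive setting, the threshold and minimum length would plausibly be controls the analyst adjusts while inspecting the resulting candidates, rather than fixed values.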