Hypothesis Generation From Evolving Scientific Corpora - Promises and Challenges

Kishlay Jha
Department of Computer Science
University of Virginia

Abstract: Hypothesis generation is a crucial element in making scientific discoveries. Traditionally, scientists form hypotheses based on their intuition, their ability to make creative chance connections, and their prior knowledge and experience. This requires them to selectively read hundreds (sometimes thousands) of articles to develop testable hypotheses. However, in the present data-intensive era, it is infeasible for an individual scientist or a research team to keep up with all the relevant articles published in their area of interest. While technologies based on text summarization can help users get a high-level idea of the papers, they fail to stitch together disparate and temporally evolving facts into novel and actionable insights that can drive new research frontiers.

To overcome these challenges, our research group has been developing a temporally robust computational framework that aims to identify hitherto unknown implicit connections between medical concepts by modeling their semantic evolution over time. In this talk, I will present the “promises and challenges” of our initial steps in this direction.