Interactive Online Learning for Intelligent Systems

Qingyun Wu
Department of Computer Science
University of Virginia

  • Date: Friday November 8th, 2019
  • Time: 12:00PM
  • Location: Rice 242

Abstract The past several years have witnessed a growing need for intelligent systems, such as recommender systems and intelligent control systems in CPS, that work in real time to satisfy people’s various needs. Due to the heterogeneity and dynamic nature of the large user populations in most information service systems, a generic offline-trained algorithm can hardly satisfy each individual user’s needs, which calls for interactive online learning solutions. Online learning solutions explore the unknowns by sequentially collecting individual users’ feedback, which helps address the notorious explore/exploit dilemma in sequential decision making. However, the wide spectrum of application scenarios also brings new challenges to interactive online learning, such as temporal dynamics in the online learning environment and collaboration effects among users/environments. In this talk, I will present our most recent interactive online learning solutions based on contextual bandits that address these challenges. More specifically, I will introduce our collaborative multi-armed bandits and non-stationary bandits, which leverage collaborative effects and temporal dynamics during interactive online learning in real-world systems.
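
To make the explore/exploit trade-off concrete, here is a minimal sketch of a standard linear contextual bandit with UCB-style exploration (plain disjoint LinUCB, not the collaborative or non-stationary variants from the talk); it assumes each arm's expected reward is linear in a context vector, and the exploration bonus shrinks as feedback accumulates.

```python
import numpy as np

class LinUCB:
    """Minimal linear contextual bandit with UCB exploration (disjoint LinUCB sketch)."""

    def __init__(self, n_arms, dim, alpha=1.0):
        self.alpha = alpha
        # Per-arm ridge-regression statistics: A = I + sum(x x^T), b = sum(r x).
        self.A = [np.eye(dim) for _ in range(n_arms)]
        self.b = [np.zeros(dim) for _ in range(n_arms)]

    def select(self, x):
        """Pick the arm with the highest upper confidence bound for context x."""
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b                             # point estimate of the arm's reward model
            bonus = self.alpha * np.sqrt(x @ A_inv @ x)   # exploration bonus (uncertainty)
            scores.append(theta @ x + bonus)
        return int(np.argmax(scores))

    def update(self, arm, x, reward):
        """Fold the observed user feedback back into the chosen arm's statistics."""
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x
```

Roughly speaking, the collaborative and non-stationary extensions described in the talk change how such per-arm statistics are shared across related users and discounted or reset as the environment drifts over time.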

Manipulating Learning Algorithms in Strategic Environments

Haifeng Xu
Department of Computer Science
University of Virginia

  • Date: Friday November 1st, 2019
  • Time: 12:00PM
  • Location: Rice 242

Abstract There has been a significant amount of recent interest in adversarial attacks on machine learning algorithms, particularly deep learning algorithms. In this talk, we pursue a closely related, yet far less explored, theme along this research agenda, namely strategic attacks on learning algorithms. In particular, we consider settings where the learner faces a strategic agent who manipulates the learning algorithm simply to optimize his own utility, as opposed to completely ruining the learner’s algorithm as in adversarial ML. Such strategic interactions naturally arise in many decision-focused learning tasks including, e.g., learning to set a price for an unknown buyer and learning to defend against an unknown attacker. We describe a general framework for theoretically analyzing the attacker’s optimal strategic attack, and then instantiate the framework and analysis in two basic scenarios. Finally, we consider how to defend against such strategic attacks and provide formal barriers to the design of an optimal defense for the learner.
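
As a toy illustration of the pricing example (not the framework from the talk), the simulation below assumes a naive seller who bisects on the buyer's value from accept/reject feedback, and a strategic buyer who rejects affordable prices early to drive future prices down; the function names and the buyer's decision rule are hypothetical.

```python
def run_pricing_game(buyer_value=0.8, rounds=50, patience=0.9, strategic=True):
    """Toy repeated posted-price game: the seller learns a price from purchase
    feedback; a strategic buyer may reject affordable prices to lower future prices."""
    lo, hi = 0.0, 1.0              # seller's bracket on the unknown buyer value
    revenue = 0.0
    for _ in range(rounds):
        price = (lo + hi) / 2.0     # naive learner: bisect on observed accept/reject
        myopic_buy = price <= buyer_value
        if strategic and myopic_buy:
            # Crude heuristic: forgo a small immediate surplus if rejecting
            # promises a larger discounted surplus from lower future prices.
            future_gain = patience * (buyer_value - lo) / 2.0
            buy = (buyer_value - price) >= future_gain
        else:
            buy = myopic_buy
        if buy:
            revenue += price
            lo = price              # learner infers value >= price
        else:
            hi = price              # learner infers value < price (wrongly, if strategic)
    return revenue

print("revenue vs truthful buyer :", round(run_pricing_game(strategic=False), 2))
print("revenue vs strategic buyer:", round(run_pricing_game(strategic=True), 2))
```

In this toy run, the seller's price converges to the truthful buyer's value, while a single early strategic rejection caps all subsequent prices well below it, so the manipulating buyer keeps a persistent surplus at the learner's expense.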

Adversarial Examples from Concentration of Measure

Saeed Mahloujifar
Department of Computer Science
University of Virginia

  • Date: Friday October 25th, 2019
  • Time: 12:00PM
  • Location: Rice 242

Abstract Many recent works have shown that adversarial instances that fool classifiers can be found by adding a small perturbation to a normally sampled input. In this talk, I will present a connection between adversarial examples and the well-known phenomenon of “concentration of measure” in high-dimensional metric probability spaces. In two recent works [1,2], we show that if the metric probability space of the test instances is concentrated, any classifier with some initial constant error is inherently vulnerable to adversarial perturbations, provided the perturbation is allowed to be sublinear in the input’s total size. Although many theoretically natural probability spaces (e.g., the isotropic Gaussian) are known to be concentrated, it is not clear whether these theoretical results apply to actual distributions such as images. I will also discuss results from a more recent work [3], in which we present a method for empirically measuring and bounding the concentration of a concrete dataset, and which is proven to converge to the actual concentration.

References

[1] Diochnos, D., Mahloujifar, S., & Mahmoody, M. (2018). Adversarial risk and robustness: General definitions and implications for the uniform distribution. In Advances in Neural Information Processing Systems (pp. 10359-10368).
[2] Mahloujifar, S., Diochnos, D. I., & Mahmoody, M. (2019, July). The curse of concentration in robust learning: Evasion and poisoning attacks from concentration of measure. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 33, pp. 4536-4543).
[3] Mahloujifar, S., Zhang, X., Mahmoody, M., & Evans, D. (2019). Empirically measuring concentration: Fundamental limits on intrinsic robustness. To appear in Advances in Neural Information Processing Systems.
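
The concentration phenomenon underlying these results can be sanity-checked with a toy experiment (an isotropic Gaussian with a half-space "error set"; this is only an illustration, not the measurement procedure from [3]): a region of measure 1% expands to a constant fraction of the space under a perturbation budget that is small relative to the typical norm of the input.

```python
import numpy as np
from scipy.stats import norm

def expansion_of_halfspace(samples, threshold, eps):
    """Fraction of samples within Euclidean distance eps of the error set
    E = {x : x[0] >= threshold}; here dist(x, E) = max(0, threshold - x[0])."""
    return float(np.mean(samples[:, 0] >= threshold - eps))

rng = np.random.default_rng(0)
X = rng.standard_normal((100_000, 100))   # isotropic Gaussian, d = 100, typical norm ~ 10
alpha = 0.01                              # initial "classifier error"
t = norm.ppf(1 - alpha)                   # half-space of measure ~alpha

for eps in (0.0, 0.5, 1.0, 2.0):
    frac = expansion_of_halfspace(X, t, eps)
    print(f"eps = {eps}: measure of eps-expansion ~ {frac:.3f}")
```

Even a perturbation budget of eps = 2 (tiny compared to the input's norm of roughly 10) turns a 1% error region into roughly a third of the space, which is the qualitative vulnerability the theoretical results formalize.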

Two Short Talks from Prof. Yanjun Qi's Group

  • Date: Friday October 18, 2019
  • Time: 12:00PM
  • Location: Rice 242

Title: Neural Message Passing for Multi-Label Classification
Speaker: Jack Lanchantin

Abstract: Multi-label classification (MLC) is the task of assigning a set of target labels to a given sample. Modeling the combinatorial label interactions in MLC has been a long-haul challenge. We propose Label Message Passing (LaMP) Neural Networks to efficiently model the joint prediction of multiple labels. LaMP treats labels as nodes on a label-interaction graph and computes the hidden representation of each label node conditioned on the input using attention-based neural message passing. Attention enables LaMP to assign different importance to neighbor nodes per label, learning how labels interact (implicitly). The proposed models are simple, accurate, interpretable, structure-agnostic, and applicable for predicting dense labels since LaMP is incredibly parallelizable. We validate the benefits of LaMP on seven real-world MLC datasets covering a broad spectrum of input/output types, outperforming state-of-the-art results. Notably, LaMP enables intuitive interpretation of how classifying each label depends on the elements of a sample and, at the same time, on its interactions with other labels.
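
A schematic sketch of the attention-based message passing idea is shown below (a hypothetical PyTorch module, not the authors' released LaMP implementation): label embeddings act as graph nodes, first attend to the input's elements, then attend to each other, and are finally read out as per-label probabilities.

```python
import torch
import torch.nn as nn

class LabelMessagePassing(nn.Module):
    """Schematic LaMP-style layer: label nodes attend to the input features and
    to each other, then each label's state is read out as a probability."""

    def __init__(self, n_labels, feat_dim, hidden=64, heads=4):
        super().__init__()
        self.label_emb = nn.Embedding(n_labels, hidden)          # one node per label
        self.feat_proj = nn.Linear(feat_dim, hidden)
        self.input_attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.label_attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.readout = nn.Linear(hidden, 1)

    def forward(self, feats):
        # feats: (batch, seq_len, feat_dim), e.g. token or patch features of the input.
        B = feats.size(0)
        labels = self.label_emb.weight.unsqueeze(0).expand(B, -1, -1)
        x = self.feat_proj(feats)
        # Step 1: each label node attends to the elements of the input.
        labels, _ = self.input_attn(labels, x, x)
        # Step 2: label nodes attend to one another (implicit label interactions).
        labels, _ = self.label_attn(labels, labels, labels)
        return torch.sigmoid(self.readout(labels)).squeeze(-1)   # (batch, n_labels)

# Example: 10 labels, inputs with 32 elements of dimension 16.
model = LabelMessagePassing(n_labels=10, feat_dim=16)
probs = model(torch.randn(4, 32, 16))
print(probs.shape)  # torch.Size([4, 10])
```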

Title: FastGSK: Fast and Efficient Sequence Analysis using Gapped String Kernels
Speaker: Derrick Blakely

Abstract: Character-level string kernel methods achieve strong classification performance on DNA, protein, and text data using modestly sized training sets. However, existing methods suffer from slow kernel computation time and overfitting, owing to the exponential dependence on alphabet size and n-gram length. In this work, we introduce a new character-level string kernel algorithm using gapped n-grams, called FastGSK. We formulate the kernel function as a series of independent counting operations, which we sample to obtain a fast approximation algorithm. This work enables state-of-the-art string kernel methods to scale to any alphabet size (DNA, protein, or natural language) and use longer features. Moreover, we use a modern software architecture with a multithreaded implementation backend and a PyPI package frontend. We experimentally show that FastGSK matches or outperforms existing string kernel methods, as well as recurrent neural networks, across a variety of sequence analysis tasks.
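
The counting-and-sampling idea can be sketched as follows (a toy gapped k-mer kernel with randomly sampled position sets, not the FastGSK codebase; function names are hypothetical): each length-g window contributes a feature at k retained positions, and sampling the position sets approximates the full sum over all C(g, k) choices.

```python
import random
from collections import Counter

def gapped_kmer_counts(seq, g=6, k=4, n_samples=10, seed=0):
    """Toy gapped k-mer featurization: from each length-g window, keep the characters
    at k sampled positions (the rest act as gaps) and count the resulting features.
    Sampling position sets approximates summing over all C(g, k) choices."""
    rng = random.Random(seed)
    position_sets = [tuple(sorted(rng.sample(range(g), k))) for _ in range(n_samples)]
    counts = Counter()
    for i in range(len(seq) - g + 1):
        window = seq[i:i + g]
        for pos in position_sets:
            counts[(pos, ''.join(window[p] for p in pos))] += 1
    return counts

def kernel(s1, s2, **kw):
    """Approximate gapped string kernel: inner product of sampled feature counts
    (both sequences must use the same sampled position sets, i.e. the same seed)."""
    c1, c2 = gapped_kmer_counts(s1, **kw), gapped_kmer_counts(s2, **kw)
    return sum(v * c2[f] for f, v in c1.items())

print(kernel("ACGTACGTAGCT", "ACGTTCGTAGCA"))   # similar sequences: large value
print(kernel("ACGTACGTAGCT", "TTTTTTTTTTTT"))   # dissimilar sequences: near zero
```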

Three Short Talks from Prof. Daniel Weller's Group

  • Date: Friday October 11, 2019
  • Time: 12:00PM
  • Location: Rice 242

Title: Optimizing regularization parameters of image processing algorithms through machine learning
Speaker: Tanjin Taher Toma

Abstract: In image and video processing problems (e.g., enhancement, reconstruction, segmentation), algorithms often have regularization parameters that need to be set appropriately to obtain good results. Existing automatic parameter selection techniques are mostly iterative and often rely on a predetermined image metric (such as image quality or a risk estimate) to estimate the parameter value. In this talk, we discuss our convolutional neural network approach for direct parameter estimation and demonstrate its effectiveness compared to existing methods.
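
A schematic version of the direct-estimation idea (a hypothetical small CNN regressor, not the authors' architecture) maps a degraded image straight to a positive regularization weight and is trained to match parameters chosen by an oracle, e.g. a grid search against ground truth:

```python
import torch
import torch.nn as nn

class ParamRegressor(nn.Module):
    """Schematic CNN that maps a noisy image directly to a regularization weight,
    avoiding iterative parameter search at test time."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 1)

    def forward(self, img):
        z = self.features(img).flatten(1)
        # Softplus keeps the predicted regularization weight positive.
        return nn.functional.softplus(self.head(z))

# Training target: the parameter an oracle (e.g. grid search against ground truth)
# would have chosen for each training image (both tensors below are made-up data).
model = ParamRegressor()
imgs = torch.randn(8, 1, 64, 64)          # batch of degraded inputs
oracle_lambda = torch.rand(8, 1)          # oracle-chosen parameters
loss = nn.functional.mse_loss(model(imgs), oracle_lambda)
loss.backward()
```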

Title: Myocardial T1 mapping with convolutional neural networks
Speaker: Haris Jeelani

Abstract: The longitudinal relaxation time (T1) of the hydrogen protons in the heart wall can be used as an indicator for a variety of pathological conditions. Traditionally, a pixel-wise nonlinear model fit, which is sensitive to noise, is used to obtain T1 maps. As discussed in my previous AIML talk, to increase noise robustness we have been using a convolutional neural network framework (DeepT1). In this talk I will discuss the updates we have made to our DeepT1 framework. The updated model includes a recurrent and a U-net model to improve the performance of T1-map estimation. This is joint work with Dr. Michael Salerno and Dr. Christopher Kramer (Cardiology) and Dr. Yang Yang (now at the Mount Sinai School of Medicine).
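
For context, the conventional pixel-wise baseline that a learned approach like DeepT1 replaces can be sketched as a three-parameter inversion-recovery fit with the usual Look-Locker correction (a simplified illustration assuming a MOLLI-style acquisition; the inversion times and noise level below are made up):

```python
import numpy as np
from scipy.optimize import curve_fit

def ir_signal(ti, a, b, t1_star):
    """Three-parameter inversion-recovery model: S(TI) = A - B * exp(-TI / T1*)."""
    return a - b * np.exp(-ti / t1_star)

def fit_t1_pixel(ti, signal):
    """Conventional pixel-wise fit; the apparent T1* is corrected to T1 via the
    Look-Locker relation T1 = T1* (B/A - 1). Sensitive to noise in the samples."""
    p0 = (signal.max(), 2 * signal.max(), 1000.0)   # rough initialization (ms)
    (a, b, t1_star), _ = curve_fit(ir_signal, ti, signal, p0=p0, maxfev=5000)
    return t1_star * (b / a - 1.0)

# Simulated noisy pixel with true T1 = 1200 ms (hypothetical inversion times, in ms).
ti = np.array([100., 180., 260., 1000., 1080., 1160., 2000., 3000.])
true_a, true_b, true_t1 = 1.0, 2.0, 1200.0
clean = ir_signal(ti, true_a, true_b, true_t1 / (true_b / true_a - 1.0))
noisy = clean + 0.02 * np.random.default_rng(0).standard_normal(ti.size)
print(f"estimated T1 ~ {fit_t1_pixel(ti, noisy):.0f} ms")
```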

Title: Examining Working Memory Representations for Neural Networks Trained to Play Games
Speaker: Tyler Spears (supervised by Per Sederberg, Psychology)

Abstract: The current success of deep learning is owed, in no small part, to the field’s roots in cognitive neuroscience. In this work, we examine the properties of several human-based models of working memory (WM), and analyze their computational utility when combined with deep neural networks. We then put forth the Scale-Invariant Temporal History (SITH) model, an applied variant of a WM model recently proposed in the cognitive neuroscience literature. Finally, we discuss future applications of SITH in artificial intelligence, as well as the future of neurally-inspired machine learning methods. This work was supervised by Prof. Per Sederberg (Psychology).
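
A highly simplified sketch of the intuition behind scale-invariant temporal history is a bank of leaky integrators with geometrically spaced time constants, which keeps a log-compressed trace of recent observations; the actual SITH model additionally applies an inverse operator to recover an approximate timeline, which is omitted in this toy version (all parameter values below are hypothetical).

```python
import numpy as np

class LeakyIntegratorMemory:
    """Simplified working-memory sketch: each row of the state decays at its own
    timescale, so together the rows form a log-compressed history of the input.
    (The inverse step that SITH uses to recover a timeline is omitted here.)"""

    def __init__(self, n_features, taus=(2, 4, 8, 16, 32, 64)):
        self.taus = np.array(taus, dtype=float)
        self.state = np.zeros((len(taus), n_features))

    def step(self, x):
        # Decay each timescale's trace and blend in the new observation.
        decay = np.exp(-1.0 / self.taus)[:, None]
        self.state = decay * self.state + (1.0 - decay) * np.asarray(x)
        return self.state.ravel()   # flattened memory, e.g. as input to a policy network

mem = LeakyIntegratorMemory(n_features=3)
for t in range(100):
    obs = np.array([np.sin(t / 10.0), 0.0, 1.0])   # stand-in for game observations
    features = mem.step(obs)
print(features.shape)   # (6 timescales * 3 features,) = (18,)
```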