I am a Computer Science PhD student at Purdue University. In 2015, I graduated with an integrated B.Tech/M.Tech degree in Computer Science from the Indian Institute of Technology Madras, where I worked with Sutanu Chakraborti. After a brief stint at Adobe Systems, I joined Purdue, where I work primarily with Bruno Ribeiro. My research interests lie in improving the out-of-distribution robustness of deep neural networks, particularly by incorporating domain knowledge such as transformation invariances or physics models.
Publications
SC Mouli, B Ribeiro, Asymmetry Learning for Counterfactually-invariant Classification in OOD Tasks, In International Conference on Learning Representations (ICLR), 2022 (Oral). [Paper]
Summary: Generalizing from observed to new related environments (out-of-distribution) is central to the reliability of classifiers. However, most classifiers fail to predict label Y from input X when the change in environment is due to a (stochastic) input transformation not observed in training. This work argues that when the train and test transformations are (arbitrary) symmetry transformations induced by a collection of m known equivalence relations, the task of finding a robust OOD classifier can be cast as finding the simplest causal model that establishes a causal connection between the target labels and the symmetry transformations associated with label changes. We then propose a new learning paradigm, asymmetry learning, which identifies which symmetries the classifier must break in order to correctly predict Y in both train and test. Asymmetry learning performs a causal model search that, under certain identifiability conditions, finds classifiers that perform equally well in-distribution and out-of-distribution. Finally, we show how to learn counterfactually-invariant representations with asymmetry learning in two simulated physics tasks and six image classification tasks.
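To give a flavor of the idea, here is a toy sketch (not the paper's causal model search) of selecting which candidate symmetries a classifier may keep: a transformation is kept only if it never changes the label on the training inputs. The `label_fn` oracle is a simplification for illustration; in the actual single-environment setting labels of transformed inputs are not observable, which is exactly why the paper needs a causal search with identifiability conditions.

```python
def consistent_invariances(candidates, inputs, label_fn):
    """Toy search for the symmetries a classifier may keep: a candidate
    transformation survives iff it never changes the label on the
    training inputs. `candidates` maps a name to a transformation
    function (hypothetical toy interface)."""
    return [
        name
        for name, transform in candidates.items()
        if all(label_fn(transform(x)) == label_fn(x) for x in inputs)
    ]

# Toy example: labels depend on |x|, so sign flips are a symmetry the
# classifier can keep, while doubling is one it must break.
cands = {"negate": lambda x: -x, "double": lambda x: 2 * x}
kept = consistent_invariances(cands, [1, 2, 3], abs)  # ["negate"]
```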
SC Mouli, B Ribeiro, Neural Networks for Learning Counterfactual G-Invariances from Single Environments, In International Conference on Learning Representations (ICLR), 2021. [Paper] [Code]
Summary: This work introduces a novel learning framework for single-environment extrapolation, in which invariance to transformation groups is assumed by default, even without evidence, unless the learner deems it inconsistent with the training data. Given m transformation groups, we build a lattice of subspaces such that any weight vector in a subspace is invariant only to a subset of these m groups. We build a neural network using the basis vectors of these subspaces and impose a penalty on its weights to enforce the maximal invariance achievable without hurting training performance.
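The building block behind such invariant subspaces can be illustrated with standard linear algebra (this is a generic sketch, not the paper's lattice construction): averaging the matrices of a finite transformation group gives the Reynolds operator, and its eigenvectors with eigenvalue 1 span the subspace of weight vectors fixed by every group element.

```python
import numpy as np

def invariant_subspace(group_mats, tol=1e-8):
    """Basis of the subspace of weight vectors fixed by every element of
    a finite transformation group, given as a list of (d, d) matrices.
    The average of the group matrices is the Reynolds operator; its
    eigenvectors with eigenvalue 1 span the invariant subspace."""
    reynolds = np.mean(group_mats, axis=0)
    eigvals, eigvecs = np.linalg.eig(reynolds)
    basis = eigvecs[:, np.isclose(eigvals, 1.0, atol=tol)]
    return np.real(basis)

# Example: the group {I, swap} acting on R^2 by swapping coordinates.
# Invariant weight vectors satisfy w1 == w2, a 1-D subspace.
swap = np.array([[0.0, 1.0], [1.0, 0.0]])
basis = invariant_subspace([np.eye(2), swap])
```

A network whose weights are expressed in such a basis is invariant to the group by construction; penalizing the coefficients outside the basis is one way to prefer invariance unless the data demands otherwise.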
M Minaei*, SC Mouli*, M Mondal, B Ribeiro, A Kate, Deceptive Deletions for Protecting Withdrawn Posts on Social Media Platforms, In Network and Distributed System Security Symposium (NDSS), 2021. [Paper] [Code]
Summary: Over-sharing poorly-worded thoughts and personal information is prevalent on online social platforms. In many cases, users regret posting such content and delete it. Unfortunately, these deletions make users more susceptible to privacy violations by malicious actors who specifically hunt for post deletions at scale via archival services. We introduce Deceptive Deletion, a mechanism that injects decoy deletions, creating a two-player minmax game between an adversary that seeks to identify damaging content among the deleted posts and a challenger that employs decoy deletions to camouflage the real damaging deletions.
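One round of the game can be caricatured as follows (a deliberately simplified sketch: the paper's adversary is a learned classifier, not a score threshold, and the `score` feature and function names here are hypothetical). The challenger injects decoys whose scores mimic the damaging posts, so the adversary's top-ranked flags mix decoys with real deletions and its precision collapses.

```python
def adversary_precision(deleted, k):
    """Adversary flags the k highest-scoring deleted posts as damaging;
    precision = fraction of flags that are truly damaging."""
    flagged = sorted(deleted, key=lambda p: p["score"], reverse=True)[:k]
    return sum(p["damaging"] for p in flagged) / k

def challenger_round(damaging, decoy_pool, k):
    """Challenger injects the k benign decoys whose scores best mimic
    the damaging posts, masking them among the deletions."""
    target = sum(p["score"] for p in damaging) / len(damaging)
    decoys = sorted(decoy_pool, key=lambda p: abs(p["score"] - target))[:k]
    return damaging + decoys
```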
SC Mouli, L Teixeira, J Neville, B Ribeiro, Deep Lifetime Clustering, arXiv:1910.00547, 2019. [Paper] [Code]
Summary: The goal of lifetime clustering is to develop an inductive model that maps subjects into clusters according to their underlying (unobserved) lifetime distribution. We introduce a neural-network-based lifetime clustering model that finds cluster assignments by directly maximizing the divergence between the empirical lifetime distributions of the clusters via a tight upper bound of the two-sample Kuiper test p-value.
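The p-value bound and the clustering loss are in the paper; as background, here is a minimal sketch of just the two-sample Kuiper statistic that underlies them, computed from the empirical CDFs of two lifetime samples.

```python
import numpy as np

def kuiper_statistic(a, b):
    """Two-sample Kuiper statistic D+ + D- between samples a and b: the
    sum of the largest ECDF gaps in each direction. Unlike the
    Kolmogorov-Smirnov statistic, it is equally sensitive across the
    whole distribution, including the tails."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    ecdf_a = np.searchsorted(a, grid, side="right") / len(a)
    ecdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return np.max(ecdf_a - ecdf_b) + np.max(ecdf_b - ecdf_a)
```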
C Meng, SC Mouli, B Ribeiro, J Neville, Subgraph pattern neural networks for high-order graph evolution prediction, Thirty-Second AAAI Conference on Artificial Intelligence, 2018. [Paper] [Code]
Summary: In this work, we generalize traditional node/link prediction tasks in dynamic heterogeneous networks to joint prediction over larger k-node induced subgraphs. The key insight is to incorporate the dependencies among the training observations of induced subgraphs into both the input features and the model architecture itself via high-order dependencies, yielding a representation that is invariant to isomorphisms.
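What isomorphism invariance means for small induced subgraphs can be illustrated with a brute-force canonical form (a generic sketch, not the paper's architecture): two relabelings of the same k-node subgraph map to the same code, so any model consuming the code is invariant by construction. This is feasible only for small k, since it enumerates all k! relabelings.

```python
import itertools
import numpy as np

def canonical_form(adj):
    """Isomorphism-invariant code for a small induced subgraph: the
    lexicographically smallest flattened adjacency matrix over all
    node relabelings (brute force over k! permutations)."""
    k = len(adj)
    adj = np.asarray(adj)
    return min(
        tuple(adj[np.ix_(p, p)].ravel())
        for p in itertools.permutations(range(k))
    )

# Two labelings of a 3-node path get the same code; a triangle differs.
path_a = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
path_b = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
```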
SC Mouli, S Chakraborti, Making the Most of Preference Feedback by Modeling Feature Dependencies, Proceedings of the 9th ACM Conference on Recommender Systems, 2015. [Paper]
Summary: In conversational recommender systems based on preference-based feedback, the user selects the most preferred item from a list of recommended products, and the system updates its recommendations. In this paper, we model the user's preferences while exploiting the dependencies between the product features to build a robust user preference model.