Causally Abstracted Multi-armed Bandits

Fabio Massimo Zennaro, Nicholas Bishop, Joel Dyer, Yorgos Felekis, Anisoara Calinescu, Michael Wooldridge, Theodoros Damoulas. Published in UAI 2024 (Conference on Uncertainty in Artificial Intelligence), 2024

Multi-armed bandits (MAB) and causal MABs (CMAB) are established frameworks for decision-making problems. The majority of prior work typically studies and solves individual MAB and CMAB in isolation for a given problem and associated data. However, decision-makers are often faced with multiple related problems and multi-scale observations where joint formulations are needed in order to efficiently exploit the problem structures and data dependencies. Transfer learning for CMABs addresses the situation where models are defined on identical variables, although causal connections may differ. In this work, we extend transfer learning to setups involving CMABs defined on potentially different variables, with varying degrees of granularity, and related via an abstraction map. Formally, we introduce the problem of causally abstracted MABs (CAMABs) by relying on the theory of causal abstraction in order to express a rigorous abstraction map. We propose algorithms to learn in a CAMAB, and study their regret. We illustrate the limitations and the strengths of our algorithms on a real-world scenario related to online advertising.

Download here

Learning Consistent Causal Abstractions with Genetic Algorithms

Fabio Massimo Zennaro and Kai Olav Ellefsen. Published in NORA 2024 (Norwegian Artificial Intelligence Research Consortium Conference), 2024

Recent work in machine learning and artificial intelligence has dealt with bringing together two fundamental concepts underlying intelligence: causality and abstraction. Given two SCMs representing the same system at different levels of detail, learning an abstraction between them turned out to be a hard combinatorial problem, and solutions have been proposed relying on neural networks and optimal transport. In this paper, we consider the original problem of abstraction learning and we generalize it. We start by highlighting a connection to the standard bin packing problem, and then we show that the problem at hand is actually an instance of a larger class of non-decomposable product of combinatorial problems. We propose an approach to solving these problems relying on genetic algorithms. Compared with a gradient descent approach, genetic algorithms do not require a relaxation of the problem and they can explore different regions of the solution space without getting stuck in sub-optimal minima. Furthermore, we will also show that the compositional form of the problem is well suited to the schemata hypothesis for genetic algorithms.

Download here

Causal Optimal Transport of Abstractions

Yorgos Felekis, Fabio Massimo Zennaro, Nicola Branchini, Theodoros Damoulas. Published in CLeaR 2024 (Causal Learning and Reasoning), 2024

Causal abstraction (CA) theory establishes formal criteria for relating multiple structural causal models (SCMs) at different levels of granularity by defining maps between them. These maps have significant relevance for real-world challenges such as synthesizing causal evidence from multiple experimental environments, learning causally consistent representations at different resolutions, and linking interventions across multiple SCMs. In this work, we propose COTA, the first method to learn abstraction maps from observational and interventional data without assuming complete knowledge of the underlying SCMs. In particular, we introduce a multi-marginal Optimal Transport (OT) formulation that enforces do-calculus causal constraints, together with a cost function that relies on interventional information. We extensively evaluate COTA on synthetic and real-world problems, and showcase its advantages over non-causal, independent and aggregated COTA formulations. Finally, we demonstrate the efficiency of our method as a data augmentation tool by comparing it against the state-of-the-art CA learning framework, which assumes fully specified SCMs, on a real-world downstream task.

Download here

Interventionally Consistent Surrogates for Agent-based Simulators

Joel Dyer, Nicholas Bishop, Yorgos Felekis, Fabio Massimo Zennaro, Anisoara Calinescu, Theodoros Damoulas, Michael Wooldridge. Published in arXiv, 2023

Agent-based simulators provide granular representations of complex intelligent systems by directly modelling the interactions of the system’s constituent agents. Their high-fidelity nature enables hyper-local policy evaluation and testing of what-if scenarios, but is associated with large computational costs that inhibit their widespread use. Surrogate models can address these computational limitations, but they must behave consistently with the agent-based model under policy interventions of interest. In this paper, we capitalise on recent developments on causal abstractions to develop a framework for learning interventionally consistent surrogate models for agent-based simulators. Our proposed approach facilitates rapid experimentation with policy interventions in complex systems, while inducing surrogates to behave consistently with high probability with respect to the agent-based simulator across interventions of interest. We demonstrate with empirical studies that observationally trained surrogates can misjudge the effect of interventions and misguide policymakers towards suboptimal policies, while surrogates trained for interventional consistency with our proposed method closely mimic the behaviour of an agent-based model under interventions of interest.

Download here

Quantifying Consistency and Information Loss for Causal Abstraction Learning

Fabio Massimo Zennaro, Paolo Turrini, Theodoros Damoulas. Published in IJCAI 2023 (International Joint Conference on Artificial Intelligence), 2023

Structural causal models provide a formalism to express causal relations between variables of interest. Models and variables can represent a system at different levels of abstraction, whereby variables and relations may be coarsened and refined according to the need of an agent or a modeller. However, switching between different levels of abstraction requires evaluating the trade-off between the consistency and the information loss among models at different levels of abstraction. In this paper we introduce a family of interventional measures that an agent or a modeller may use to evaluate such a trade-off. We analyze the properties of these measures, and propose algorithms to evaluate and learn causal abstractions. Finally, we illustrate the flexibility of our setup by empirically showing how different measures and algorithmic choices may lead to different abstractions.

Download here

Jointly Learning Consistent Causal Abstractions Over Multiple Interventional Distributions

Fabio Massimo Zennaro, Máté Drávucz, Geanina Apachitei, W. Dhammika Widanage, Theodoros Damoulas. Published in CLeaR 2023 (Causal Learning and Reasoning) [Oral presentation: 9% acceptance rate], 2023

An abstraction can be used to relate two structural causal models representing the same system at different levels of resolution. Learning abstractions which guarantee consistency with respect to interventional distributions would allow one to jointly reason about evidence across multiple levels of granularity while respecting the underlying cause-effect relationships. In this paper, we introduce a first framework for causal abstraction learning between SCMs based on the formalization of abstraction recently proposed by Rischel (2020). Based on that, we propose a differentiable programming solution that jointly solves a number of combinatorial sub-problems, and we study its performance and benefits against independent and sequential approaches on synthetic settings and on a challenging real-world problem related to electric vehicle battery manufacturing.

Download here

Modeling Penetration Testing with Reinforcement Learning Using Capture-the-Flag Challenges: Trade-offs between Model-free Learning and A Priori Knowledge

Fabio Massimo Zennaro, Laszlo Erdodi. Published in IET Information Security, 2023

Penetration testing is a security exercise aimed at assessing the security of a system by simulating attacks against it. So far, penetration testing has been carried out mainly by trained human attackers and its success critically depended on the available expertise. Automating this practice constitutes a non-trivial problem, as the range of actions that a human expert may attempt against a system and the range of knowledge she relies on to take her decisions are hard to capture. In this paper, we focus our attention on simplified penetration testing problems expressed in the form of capture the flag hacking challenges, and we apply reinforcement learning algorithms to try to solve them. In modeling these capture the flag competitions as reinforcement learning problems we highlight the specific challenges that characterize penetration testing. We observe these challenges experimentally across a set of varied simulations, and we study how different reinforcement learning techniques may help us address these challenges. In this way we show the feasibility of tackling penetration testing using reinforcement learning, and we highlight the challenges that must be taken into consideration, and possible directions to solve them.

Download here

Simulating All Archetypes Of SQL Injection Vulnerability Exploitation Using Reinforcement Learning Agents

Åvald Åslaugson Sommervoll, Laszlo Erdodi, Fabio Massimo Zennaro. Published in International Journal of Information Security, 2023

Vulnerabilities such as SQL injection represent a serious challenge to security. While tools with a pre-defined logic are commonly used in the field of penetration testing, the continually-evolving nature of the security challenge calls for models able to learn autonomously from experience. In this paper we build on previous results on the development of reinforcement learning models devised to exploit specific forms of SQL injection, and we design agents that are able to tackle a varied range of SQL injection vulnerabilities, virtually comprising all the archetypes normally considered by experts. We show that our agents, trained on a synthetic environment, perform a transfer of learning among the different SQL injection challenges; in particular, they learn to use their queries to efficiently gain knowledge about multiple vulnerabilities at once. We also introduce a novel and more versatile way to interpret server messages that reduces reliance on expert inputs. Our simulations show the feasibility of our approach, which easily deals with a number of homogeneous challenges, as well as some of its limitations when presented with problems having higher degrees of uncertainty.

Towards Computing an Optimal Abstraction for Structural Causal Models

Fabio Massimo Zennaro, Paolo Turrini, Theodoros Damoulas. Published in UAI 2022 Workshop on Causal Representation Learning, 2022

Working with causal models at different levels of abstraction is an important feature of science. Existing work has already considered the problem of expressing formally the relation of abstraction between causal models. In this paper, we focus on the problem of learning abstractions. We start by defining the learning problem formally in terms of the optimization of a standard measure of consistency. We then point out the limitation of this approach, and we suggest extending the objective function with a term accounting for information loss. We suggest a concrete measure of information loss, and we illustrate its contribution to learning new abstractions.

Download here

Abstraction between Structural Causal Models: A Review of Definitions and Properties

Fabio Massimo Zennaro. Published in UAI 2022 Workshop on Causal Representation Learning [Best paper award], 2022

Structural causal models (SCMs) are a widespread formalism to deal with causal systems. A recent direction of research has considered the problem of relating formally SCMs at different levels of abstraction, by defining maps between SCMs and imposing a requirement of interventional consistency. This paper offers a review of the solutions proposed so far, focusing on the formal properties of a map between SCMs, and highlighting the different layers (structural, distributional) at which these properties may be enforced. This allows us to distinguish families of abstractions that may or may not be permitted by choosing to guarantee certain properties instead of others. Such an understanding not only allows one to distinguish among proposals for causal abstraction with more awareness, but it also allows one to tailor the definition of abstraction with respect to the forms of abstraction relevant to specific applications.

Download here

SQL Injections and Reinforcement Learning: An Empirical Evaluation of the Role of Action Structure

Manuel Del Verme, Åvald Åslaugson Sommervoll, Laszlo Erdodi, Simone Totaro, Fabio Massimo Zennaro. Published in Nordic Conference on Secure IT Systems (NordSec) 2021, 2021

Penetration testing is a central problem in computer security, and recently, the application of machine learning techniques to this topic has gathered momentum. In this paper, we consider the problem of exploiting SQL injection vulnerabilities, and we represent it as a capture-the-flag scenario in which an attacker can submit strings to an input form with the aim of obtaining a flag token representing private information. We then model the attacker as a reinforcement learning agent that interacts with the server to learn an optimal policy leading to an exploit. We compare two agents: a simpler structured agent that relies on significant a priori knowledge and uses high-level actions; and a structureless agent that has minimal a priori knowledge and generates SQL statements. The comparison showcases the feasibility of developing agents that rely on less ad-hoc modeling and illustrates a possible direction to develop agents that may have wide applicability.

Download here

Simulating SQL Injection Vulnerability Exploitation Using Q-Learning Reinforcement Learning Agents

Laszlo Erdodi, Åvald Åslaugson Sommervoll, Fabio Massimo Zennaro. Published in Journal of Information Security and Applications, 2021

In this paper, we propose a first formalization of the process of exploitation of SQL injection vulnerabilities. We consider a simplification of the dynamics of SQL injection attacks by casting this problem as a security capture-the-flag challenge. We model it as a Markov decision process, and we implement it as a reinforcement learning problem. We then deploy different reinforcement learning agents tasked with learning an effective policy to perform SQL injection; we design our training in such a way that the agent learns not just a specific strategy to solve an individual challenge but a more generic policy that may be applied to perform SQL injection attacks against any system instantiated randomly by our problem generator. We analyze the results in terms of the quality of the learned policy and in terms of convergence time as a function of the complexity of the challenge and the learning agent’s complexity. Our work fits in the wider research on the development of intelligent agents for autonomous penetration testing and white-hat hacking, and our results aim to contribute to understanding the potential and the limits of reinforcement learning in a security environment.

Download here

A new decision making model based on Rank Centrality for GDM with fuzzy preference relations

Anis Yazidi, Magdalena Ivanovska, Fabio Massimo Zennaro, Pedro G. Lind, Enrique Herrera Viedma. Published in European Journal of Operational Research, 2021

Preference aggregation in Group Decision Making (GDM) is a substantial problem that has received a lot of research attention. Decision problems involving fuzzy preference relations constitute an important class within GDM. Legacy approaches dealing with the latter type of problems can be classified into indirect approaches, which involve deriving a group preference matrix as an intermediate step, and direct approaches, which deduce a group preference ranking based on individual preference rankings. Although the work on indirect approaches has been extensive in the literature, there is still a scarcity of research dealing with the direct approaches. In this paper we present a direct approach towards aggregating several fuzzy preference relations on a set of alternatives into a single weighted ranking of the alternatives. By mapping the pairwise preferences into transition probabilities, we are able to derive a preference ranking from the stationary distribution of a stochastic matrix. Interestingly, the ranking of the alternatives obtained with our method corresponds to the optimizer of the Maximum Likelihood Estimation of a particular Bradley-Terry-Luce model. Furthermore, we perform a theoretical sensitivity analysis of the proposed method supported by experimental results and illustrate our approach towards GDM with a concrete numerical example. This work opens avenues for solving GDM problems using elements of probability theory, and thus provides a sound theoretical foundation as well as a plausible statistical interpretation for the aggregation of expert opinions in GDM.

Download here

Stack-based Buffer Overflow Detection using Recurrent Neural Networks

William Arild Dahl, Laszlo Erdodi, Fabio Massimo Zennaro. Published in arXiv, 2020

Detecting vulnerabilities in software is a critical challenge in the development and deployment of applications. One of the best-known and most dangerous vulnerabilities is the stack-based buffer overflow, which may allow potential attackers to execute malicious code. In this paper we consider the use of modern machine learning models, specifically recurrent neural networks, to detect stack-based buffer overflow vulnerabilities in the assembly code of a program. Since assembly code is a generic and common representation, focusing on this language allows us to potentially consider programs written in several different programming languages. Moreover, we subscribe to the hypothesis that code may be treated as natural language, and thus we process assembly code using standard architectures commonly employed in natural language processing. We perform a set of experiments aimed at confirming the validity of the natural language hypothesis and the feasibility of using recurrent neural networks for detecting vulnerabilities. Our results show that our architecture is able to capture subtle stack-based buffer overflow vulnerabilities that strongly depend on the context, thus suggesting that this approach may be extended to real-world settings, as well as to other forms of vulnerability detection.

Download here

Firearm Detection via Convolutional Neural Networks: Comparing a Semantic Segmentation Model Against End-to-End Solutions

Alexander Egiazarov, Fabio Massimo Zennaro and Vasileios Mavroeidis. Published in IEEE Bigdata 2020 Workshop Cyberhunt, 2020

Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents such as terrorism, general criminal offences, or even domestic violence. One way of achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis. In this paper we conduct a comparison between a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting firearms via semantic segmentation. We evaluated both models from different points of view, including accuracy, computational and data complexity, flexibility and reliability. Our results show that a semantic segmentation model provides a considerable amount of flexibility and resilience in the low data environment compared to classical deep models, although its configuration and tuning presents a challenge in achieving the same levels of accuracy as an end-to-end model.

Download here

The Agent Web Model - Modelling web hacking for reinforcement learning

Laszlo Erdodi, Fabio Massimo Zennaro. Published in International Journal of Information Security, 2020

Website hacking is a frequent attack type used by malicious actors to obtain confidential information, modify the integrity of web pages or make websites unavailable. The tools used by attackers are becoming more and more automated and sophisticated, and malicious machine learning agents seem to be the next development in this line. In order to provide ethical hackers with similar tools, and to understand the impact and the limitations of artificial agents, we present in this paper a model that formalizes web hacking tasks for reinforcement learning agents. Our model, named Agent Web Model, considers web hacking as a capture-the-flag style challenge, and it defines reinforcement learning problems at seven different levels of abstraction. We discuss the complexity of these problems in terms of actions and states an agent has to deal with, and we show that such a model allows one to represent most of the relevant web vulnerabilities. Aware that the driver of advances in reinforcement learning is the availability of standardized challenges, we provide an implementation for the first three abstraction layers, in the hope that the community would consider these challenges in order to develop intelligent web hacking agents.

Download here

Using Subjective Logic to Estimate Uncertainty in Multi-Armed Bandit Problems

Fabio Massimo Zennaro, Audun Jøsang. Published in ECML 2020 Workshop on Uncertainty in Machine Learning, 2020

The multi-armed bandit problem is a classical decision-making problem where an agent has to learn an optimal action balancing exploration and exploitation. Properly managing this trade-off requires a correct assessment of uncertainty; in multi-armed bandits, as in other machine learning applications, it is important to distinguish between stochasticity that is inherent to the system (aleatoric uncertainty) and stochasticity that derives from the limited knowledge of the agent (epistemic uncertainty). In this paper we consider the formalism of subjective logic, a concise and expressive framework to express Dirichlet-multinomial models as subjective opinions, and we apply it to the problem of multi-armed bandits. We propose new algorithms grounded in subjective logic to tackle the multi-armed bandit problem, we compare them against classical algorithms from the literature, and we analyze the insights they provide in evaluating the dynamics of uncertainty. Our preliminary results suggest that subjective logic quantities enable useful assessment of uncertainty that may be exploited by more refined agents.

Download here

A Left Realist Critique of the Political Value of Adopting Machine Learning Systems in Criminal Justice

Fabio Massimo Zennaro. Published in ECML 2020 Workshop on Data Science for Social Good, 2020

In this paper we discuss the political value of the decision to adopt machine learning in the field of criminal justice. While a lively discussion in the community focuses on the issue of the social fairness of machine learning systems, we suggest that another relevant aspect of this debate concerns the political implications of the decision of using machine learning systems. Relying on the theory of Left realism, we argue that, from several points of view, modern supervised learning systems, broadly defined as functional learned systems for decision making, fit into an approach to crime that is close to the law and order stance. Far from offering a political judgment of value, the aim of the paper is to raise awareness about the potential implicit, and often overlooked, political assumptions and political values that may be undergirding a decision that is apparently purely technical.

Download here

Towards Further Understanding of Sparse Filtering via Information Bottleneck

Fabio Massimo Zennaro, Ke Chen. Published in arXiv, 2019

In this paper we examine a formalization of feature distribution learning (FDL) in information-theoretic terms relying on the analytical approach and on the tools already used in the study of the information bottleneck (IB). It has been conjectured that the behavior of FDL algorithms could be expressed as an optimization problem over two information-theoretic quantities: the mutual information of the data with the learned representations and the entropy of the learned distribution. In particular, such a formulation was offered in order to explain the success of the most prominent FDL algorithm, sparse filtering (SF). This conjecture was, however, left unproven. In this work, we aim at providing preliminary empirical support to this conjecture by performing experiments reminiscent of the work done on deep neural networks in the context of the IB research. Specifically, we borrow the idea of using information planes to analyze the behavior of the SF algorithm and gain insights on its dynamics. A confirmation of the conjecture about the dynamics of FDL may provide solid ground to develop information-theoretic tools to assess the quality of the learning process in FDL, and it may be extended to other unsupervised learning algorithms.

Download here

Firearm Detection and Segmentation using an Ensemble of Semantic Neural Networks

Alexander Egiazarov, Vasileios Mavroeidis and Fabio Massimo Zennaro. Published in European Intelligence and Security Informatics Conference (EISIC) 2019, 2019

In recent years we have seen an upsurge in terror attacks around the world. Such attacks usually happen in public places with large crowds to cause the most damage possible and get the most attention. Even though surveillance cameras are assumed to be a powerful tool, their effect in preventing crime is far from clear, due either to limitations in the ability of humans to vigilantly monitor video surveillance or to the simple fact that they operate passively. In this paper, we present a weapon detection system based on an ensemble of semantic convolutional neural networks that decomposes the problem of detecting and locating a weapon into a set of smaller problems concerned with the individual component parts of a weapon. This approach has computational and practical advantages: a set of simpler neural networks dedicated to specific tasks requires fewer computational resources and can be trained in parallel; the overall output of the system given by the aggregation of the outputs of individual networks can be tuned by a user to trade off false positives and false negatives; finally, according to ensemble theory, the output of the overall system will be robust and reliable even in the presence of weak individual models. We evaluated our system running simulations aimed at assessing the accuracy of individual networks and the whole system. The results on synthetic data and real-world data are promising, and they suggest that our approach may have advantages compared to the monolithic approach based on a single deep convolutional neural network.

Download here

Analyzing and Storing Network Intrusion Detection Data using Bayesian Coresets: A Preliminary Study in Offline and Streaming Settings

Fabio Massimo Zennaro. Published in ECML 2019 Workshop on Machine Learning for CyberSecurity, 2019

In this paper we offer a preliminary study of the application of Bayesian coresets to network security data. Network intrusion detection is a field that could take advantage of Bayesian machine learning in modelling uncertainty and managing streaming data; however, the large size of the data sets often hinders the use of Bayesian learning methods based on MCMC. Limiting the amount of useful data is a central problem in a field like network traffic analysis, where large amounts of redundant data can be generated very quickly via packet collection. Reducing the number of samples would not only make learning more feasible, but would also contribute to reducing the need for memory and storage. We explore here the use of Bayesian coresets, a technique that reduces the amount of data samples while guaranteeing the learning of an accurate posterior distribution using Bayesian learning. We analyze how Bayesian coresets affect the accuracy of learned models, and how time-space requirements are traded off, both in a static scenario and in a streaming scenario.

Download here

Counterfactually Fair Prediction Using Multiple Causal Models

Fabio Massimo Zennaro, Magdalena Ivanovska. Published in 16th European Conference on Multi-Agent Systems (EUMAS), 2018

In this paper we study the problem of making predictions using multiple structural causal models defined by different agents, under the constraint that the prediction satisfies the criterion of counterfactual fairness. Relying on the frameworks of causality, fairness and opinion pooling, we build upon and extend previous work focusing on the qualitative aggregation of causal Bayesian networks and causal models. In order to complement previous qualitative results, we devise a method based on Monte Carlo simulations. This method enables a decision-maker to aggregate the outputs of the causal models provided by different experts while guaranteeing the counterfactual fairness of the result. We demonstrate our approach on a simple, yet illustrative, toy case study.

Download here

An Empirical Evaluation of the Approximation of Subjective Logic Operators Using Monte Carlo Simulations

Fabio Massimo Zennaro, Magdalena Ivanovska, Audun Jøsang. Published in International Journal of Approximate Reasoning, 2018

In this paper we analyze the use of subjective logic as a framework for performing approximate transformations over probability distribution functions. As for any approximation, we evaluate subjective logic in terms of computational efficiency and bias. However, while the computational cost may be easily estimated, the bias of subjective logic operators has not yet been investigated. In order to evaluate this bias, we propose an experimental protocol that exploits Monte Carlo simulations and their properties to assess the distance between the result produced by subjective logic operators and the true result of the corresponding transformation over probability distributions. This protocol allows a modeler to get an estimate of the degree of approximation she must be ready to accept as a trade-off for the computational efficiency and the interpretability of the subjective logic framework. Concretely, we apply our method to the relevant case study of the subjective logic operator for binomial multiplication and we study empirically its approximation.

Download here

Pooling of Causal Models under Counterfactual Fairness via Causal Judgement Aggregation

Fabio Massimo Zennaro, Magdalena Ivanovska. Published in ICML 2018 Workshop on Machine Learning for Causal Inference, Counterfactual Prediction, and Autonomous Action, 2018

In this paper we consider the problem of combining multiple probabilistic causal models, provided by different experts, under the requirement that the aggregated model satisfy the criterion of counterfactual fairness. We build upon the work on causal models and fairness in machine learning, and we express the problem of combining multiple models within the framework of opinion pooling. We propose two simple algorithms, grounded in the theory of counterfactual fairness and causal judgment aggregation, that are guaranteed to generate aggregated probabilistic causal models respecting the criterion of fairness, and we compare their behaviors on a toy case study.

Download here

On the Use of Sparse Filtering for Covariate Shift Adaptation

Fabio Massimo Zennaro, Ke Chen. Published in arXiv, 2016

In this paper we formally analyse the use of sparse filtering algorithms to perform covariate shift adaptation. We provide a theoretical analysis of sparse filtering by evaluating the conditions required to perform covariate shift adaptation. We prove that sparse filtering can perform adaptation only if the conditional distribution of the labels has a structure explained by a cosine metric. To overcome this limitation, we propose a new algorithm, named periodic sparse filtering, and carry out the same theoretical analysis regarding covariate shift adaptation. We show that periodic sparse filtering can perform adaptation under the looser and more realistic requirement that the conditional distribution of the labels has a periodic structure, which may be satisfied, for instance, by user-dependent data sets. We experimentally validate our theoretical results on synthetic data. Moreover, we apply periodic sparse filtering to real-world data sets to demonstrate that this simple and computationally efficient algorithm is able to achieve competitive performances.

Download here

Covariate Shift Adaptation via Sparse Filtering for High-Dimensional Periodic Data

Fabio Massimo Zennaro, Ke Chen. Published in NIPS 2016 Workshop on Learning in High Dimensions with Structure, 2016

This paper explores the use of sparse filtering algorithms applied to the problem of covariate shift adaptation. We suggest a novel algorithm, periodic sparse filtering, and we consider its application to structured high-dimensional data.

Download here

Towards Understanding Sparse Filtering: A Theoretical Perspective

Fabio Massimo Zennaro, Ke Chen. Published in Neural Networks, 2016

In this paper we present a theoretical analysis to understand sparse filtering, a recent and effective algorithm for unsupervised learning. The aim of this research is not to show whether or how well sparse filtering works, but to understand why and when sparse filtering does work. We provide a thorough theoretical analysis of sparse filtering and its properties, and further offer an experimental validation of the main outcomes of our theoretical analysis. We show that sparse filtering works by explicitly maximizing the entropy of the learned representation through the maximization of the proxy of sparsity, and by implicitly preserving mutual information between original and learned representations through the constraint of preserving a structure of the data, specifically the structure defined by relations of neighborhoodness under the cosine distance. Furthermore, we empirically validate our theoretical results with artificial and real data sets, and we apply our theoretical understanding to explain the success of sparse filtering on real-world problems. Our work provides a strong theoretical basis for understanding sparse filtering: it highlights assumptions and conditions for success behind this feature distribution learning algorithm, and provides insights for developing new feature distribution learning algorithms.

Download here

Programme Committee