Posts by Collection

classes

Unsupervised Learning

Published:

This class offers an introduction to the topic of unsupervised learning. It first contrasts the problem of unsupervised learning with the problem of supervised learning. It presents some conceptual tools (representations, structure) to formalize and reason about the unsupervised learning problem. These concepts are then used to discuss several prototypical unsupervised learning tasks, from clustering to generative modelling. Finally, three paradigmatic unsupervised learning algorithms (PCA, k-means, and autoencoders) are analyzed and evaluated.

Fair ML and Causal ML

Published:

These slides are a contribution to a class on future developments of artificial intelligence and machine learning. The first set of slides introduces the problem of fair decision-making in machine learning systems, and illustrates some sample solutions underlining ideas and flaws. The second set of slides discusses the problem of assessing causal relations in data, and points to solutions in the direction of causal graphical models.

Python for Machine Learning

Published:

This class offers an introduction to machine learning using Python. After a brief review of the main concepts in machine learning, the class discusses the main Python environment to run and train machine learning models. Examples of basic applications of machine learning models (linear regression, logistic regression, k-means, classification trees, neural networks) are then illustrated using scikit-learn. Finally, references for further learning are provided.

notes

portfolio

publications

Towards Understanding Sparse Filtering: A Theoretical Perspective

Fabio Massimo Zennaro, Ke Chen. Published in Neural Networks, 2016

In this paper we present a theoretical analysis to understand sparse filtering, a recent and effective algorithm for unsupervised learning. The aim of this research is not to show whether or how well sparse filtering works, but to understand why and when sparse filtering does work. We provide a thorough theoretical analysis of sparse filtering and its properties, and further offer an experimental validation of the main outcomes of our theoretical analysis. We show that sparse filtering works by explicitly maximizing the entropy of the learned representation through the maximization of the proxy of sparsity, and by implicitly preserving mutual information between original and learned representations through the constraint of preserving a structure of the data, specifically the structure defined by relations of neighborhoodness under the cosine distance. Furthermore, we empirically validate our theoretical results with artificial and real data sets, and we apply our theoretical understanding to explain the success of sparse filtering on real-world problems. Our work provides a strong theoretical basis for understanding sparse filtering: it highlights assumptions and conditions for success behind this feature distribution learning algorithm, and provides insights for developing new feature distribution learning algorithms.
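For readers unfamiliar with the algorithm, the sparsity proxy mentioned above can be sketched as follows. This is a minimal illustration of the sparse filtering objective as it is commonly described (soft absolute value of linear features, L2 normalization per feature and then per example, followed by an L1 penalty); it is not the code used in the paper, and the function name `sparse_filtering_objective` and the toy inputs are assumptions made for the example.

```python
import math


def sparse_filtering_objective(W, X, eps=1e-8):
    """Hypothetical helper. W: list of filters (lists of weights); X: list of examples."""
    # Soft absolute value of each linear feature: F[j][i] for filter j, example i.
    F = [[math.sqrt(sum(w * x for w, x in zip(Wj, Xi)) ** 2 + eps) for Xi in X]
         for Wj in W]
    # Step 1: normalize each feature (row) by its L2 norm across examples.
    F = [[v / math.sqrt(sum(u * u for u in row)) for v in row] for row in F]
    # Step 2: normalize each example (column) by its L2 norm across features.
    n_examples = len(X)
    col_norms = [math.sqrt(sum(row[i] ** 2 for row in F)) for i in range(n_examples)]
    F = [[row[i] / col_norms[i] for i in range(n_examples)] for row in F]
    # Sparsity proxy: total L1 norm of the normalized features (entries are positive).
    # A smaller value corresponds to a sparser representation.
    return sum(sum(row) for row in F)


# Toy data (assumed): 3 filters over 2-dimensional inputs, 3 examples.
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
X = [[1.0, 2.0], [3.0, -1.0], [0.5, 0.5]]
value = sparse_filtering_objective(W, X)
```

Because every example ends up on the unit L2 sphere, the objective depends only on the direction of each feature vector, which is consistent with the role of the cosine distance discussed in the abstract.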

Download here

Covariate Shift Adaptation via Sparse Filtering for High-Dimensional Periodic Data

Fabio Massimo Zennaro, Ke Chen. Published in NIPS 2016 Workshop on Learning in High Dimensions with Structure, 2016

This paper explores the use of sparse filtering algorithms for the problem of covariate shift adaptation. We suggest a novel algorithm, periodic sparse filtering, and we consider its application to structured high-dimensional data.

Download here

On the Use of Sparse Filtering for Covariate Shift Adaptation

Fabio Massimo Zennaro, Ke Chen. Published in arXiv, 2016

In this paper we formally analyse the use of sparse filtering algorithms to perform covariate shift adaptation. We provide a theoretical analysis of sparse filtering by evaluating the conditions required to perform covariate shift adaptation. We prove that sparse filtering can perform adaptation only if the conditional distribution of the labels has a structure explained by a cosine metric. To overcome this limitation, we propose a new algorithm, named periodic sparse filtering, and carry out the same theoretical analysis regarding covariate shift adaptation. We show that periodic sparse filtering can perform adaptation under the looser and more realistic requirement that the conditional distribution of the labels has a periodic structure, which may be satisfied, for instance, by user-dependent data sets. We experimentally validate our theoretical results on synthetic data. Moreover, we apply periodic sparse filtering to real-world data sets to demonstrate that this simple and computationally efficient algorithm is able to achieve competitive performances.

Download here

Pooling of Causal Models under Counterfactual Fairness via Causal Judgement Aggregation

Fabio Massimo Zennaro, Magdalena Ivanovska. Published in ICML 2018 Workshop on Machine Learning for Causal Inference, Counterfactual Prediction, and Autonomous Action, 2018

In this paper we consider the problem of combining multiple probabilistic causal models, provided by different experts, under the requirement that the aggregated model satisfy the criterion of counterfactual fairness. We build upon the work on causal models and fairness in machine learning, and we express the problem of combining multiple models within the framework of opinion pooling. We propose two simple algorithms, grounded in the theory of counterfactual fairness and causal judgment aggregation, that are guaranteed to generate aggregated probabilistic causal models respecting the criterion of fairness, and we compare their behaviors on a toy case study.

Download here

An Empirical Evaluation of the Approximation of Subjective Logic Operators Using Monte Carlo Simulations

Fabio Massimo Zennaro, Magdalena Ivanovska, Audun Jøsang. Published in International Journal of Approximate Reasoning, 2018

In this paper we analyze the use of subjective logic as a framework for performing approximate transformations over probability distribution functions. As for any approximation, we evaluate subjective logic in terms of computational efficiency and bias. However, while the computational cost may be easily estimated, the bias of subjective logic operators has not yet been investigated. In order to evaluate this bias, we propose an experimental protocol that exploits Monte Carlo simulations and their properties to assess the distance between the result produced by subjective logic operators and the true result of the corresponding transformation over probability distributions. This protocol allows a modeler to get an estimate of the degree of approximation she must be ready to accept as a trade-off for the computational efficiency and the interpretability of the subjective logic framework. Concretely, we apply our method to the relevant case study of the subjective logic operator for binomial multiplication and we study empirically its approximation.

Download here

Counterfactually Fair Prediction Using Multiple Causal Models

Fabio Massimo Zennaro, Magdalena Ivanovska. Published in 16th European Conference on Multi-Agent Systems (EUMAS), 2018

In this paper we study the problem of making predictions using multiple structural causal models defined by different agents, under the constraint that the prediction satisfies the criterion of counterfactual fairness. Relying on the frameworks of causality, fairness and opinion pooling, we build upon and extend previous work focusing on the qualitative aggregation of causal Bayesian networks and causal models. In order to complement previous qualitative results, we devise a method based on Monte Carlo simulations. This method enables a decision-maker to aggregate the outputs of the causal models provided by different experts while guaranteeing the counterfactual fairness of the result. We demonstrate our approach on a simple, yet illustrative, toy case study.

Download here

Analyzing and Storing Network Intrusion Detection Data using Bayesian Coresets: A Preliminary Study in Offline and Streaming Settings

Fabio Massimo Zennaro. Published in ECML 2019 Workshop on Machine Learning for CyberSecurity, 2019

In this paper we offer a preliminary study of the application of Bayesian coresets to network security data. Network intrusion detection is a field that could take advantage of Bayesian machine learning in modelling uncertainty and managing streaming data; however, the large size of the data sets often hinders the use of Bayesian learning methods based on MCMC. Limiting the amount of useful data is a central problem in a field like network traffic analysis, where large amounts of redundant data can be generated very quickly via packet collection. Reducing the number of samples would not only make learning more feasible, but would also contribute to reducing the need for memory and storage. We explore here the use of Bayesian coresets, a technique that reduces the amount of data samples while guaranteeing the learning of an accurate posterior distribution using Bayesian learning. We analyze how Bayesian coresets affect the accuracy of learned models, and how time-space requirements are traded off, both in a static scenario and in a streaming scenario.

Download here

Firearm Detection and Segmentation using an Ensemble of Semantic Neural Networks

Alexander Egiazarov, Vasileios Mavroeidis and Fabio Massimo Zennaro. Published in European Intelligence and Security Informatics Conference (EISIC) 2019, 2019

In recent years we have seen an upsurge in terror attacks around the world. Such attacks usually happen in public places with large crowds to cause the most damage possible and get the most attention. Even though surveillance cameras are assumed to be a powerful tool, their effect in preventing crime is far from clear, due either to limitations in the ability of humans to vigilantly monitor video surveillance or to the simple fact that they operate passively. In this paper, we present a weapon detection system based on an ensemble of semantic convolutional neural networks that decomposes the problem of detecting and locating a weapon into a set of smaller problems concerned with the individual component parts of a weapon. This approach has computational and practical advantages: a set of simpler neural networks dedicated to specific tasks requires fewer computational resources and can be trained in parallel; the overall output of the system given by the aggregation of the outputs of individual networks can be tuned by a user to trade off false positives and false negatives; finally, according to ensemble theory, the output of the overall system will be robust and reliable even in the presence of weak individual models. We evaluated our system by running simulations aimed at assessing the accuracy of individual networks and the whole system. The results on synthetic data and real-world data are promising, and they suggest that our approach may have advantages compared to the monolithic approach based on a single deep convolutional neural network.

Download here

Towards Further Understanding of Sparse Filtering via Information Bottleneck

Fabio Massimo Zennaro, Ke Chen. Published in arXiv, 2019

In this paper we examine a formalization of feature distribution learning (FDL) in information-theoretic terms relying on the analytical approach and on the tools already used in the study of the information bottleneck (IB). It has been conjectured that the behavior of FDL algorithms could be expressed as an optimization problem over two information-theoretic quantities: the mutual information of the data with the learned representations and the entropy of the learned distribution. In particular, such a formulation was offered in order to explain the success of the most prominent FDL algorithm, sparse filtering (SF). This conjecture was, however, left unproven. In this work, we aim at providing preliminary empirical support to this conjecture by performing experiments reminiscent of the work done on deep neural networks in the context of the IB research. Specifically, we borrow the idea of using information planes to analyze the behavior of the SF algorithm and gain insights on its dynamics. A confirmation of the conjecture about the dynamics of FDL may provide solid ground to develop information-theoretic tools to assess the quality of the learning process in FDL, and it may be extended to other unsupervised learning algorithms.

Download here

A Left Realist Critique of the Political Value of Adopting Machine Learning Systems in Criminal Justice

Fabio Massimo Zennaro. Published in ECML 2020 Workshop on Data Science for Social Good, 2020

In this paper we discuss the political value of the decision to adopt machine learning in the field of criminal justice. While a lively discussion in the community focuses on the issue of the social fairness of machine learning systems, we suggest that another relevant aspect of this debate concerns the political implications of the decision to use machine learning systems. Relying on the theory of Left realism, we argue that, from several points of view, modern supervised learning systems, broadly defined as functional learned systems for decision making, fit into an approach to crime that is close to the law and order stance. Far from offering a political value judgment, the aim of the paper is to raise awareness about the potential implicit, and often overlooked, political assumptions and political values that may be undergirding a decision that is apparently purely technical.

Download here

Using Subjective Logic to Estimate Uncertainty in Multi-Armed Bandit Problems

Fabio Massimo Zennaro, Audun Jøsang. Published in ECML 2020 Workshop on Uncertainty in Machine Learning, 2020

The multi-armed bandit problem is a classical decision-making problem where an agent has to learn an optimal action balancing exploration and exploitation. Properly managing this trade-off requires a correct assessment of uncertainty; in multi-armed bandits, as in other machine learning applications, it is important to distinguish between stochasticity that is inherent to the system (aleatoric uncertainty) and stochasticity that derives from the limited knowledge of the agent (epistemic uncertainty). In this paper we consider the formalism of subjective logic, a concise and expressive framework to express Dirichlet-multinomial models as subjective opinions, and we apply it to the problem of multi-armed bandits. We propose new algorithms grounded in subjective logic to tackle the multi-armed bandit problem, we compare them against classical algorithms from the literature, and we analyze the insights they provide in evaluating the dynamics of uncertainty. Our preliminary results suggest that subjective logic quantities enable useful assessment of uncertainty that may be exploited by more refined agents.
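As a rough illustration of how subjective logic separates belief from epistemic uncertainty, the sketch below maps the observed successes and failures of a single arm to a binomial opinion using the standard evidence-to-opinion mapping with a non-informative prior weight of 2. It is not one of the algorithms proposed in the paper; the function name `opinion_from_evidence` and the example counts are assumptions made for the example.

```python
def opinion_from_evidence(successes, failures, base_rate=0.5, prior_weight=2.0):
    """Hypothetical helper: binomial opinion (belief, disbelief, uncertainty,
    projected probability) from evidence counts for one bandit arm."""
    total = successes + failures + prior_weight
    belief = successes / total
    disbelief = failures / total
    # Epistemic uncertainty shrinks as more evidence is collected.
    uncertainty = prior_weight / total
    # Projected probability: belief plus the share of uncertainty
    # assigned according to the base rate.
    projected = belief + base_rate * uncertainty
    return belief, disbelief, uncertainty, projected


# Assumed example: an arm pulled 10 times, with 8 rewards.
b, d, u, p = opinion_from_evidence(8, 2)

# An arm that has never been pulled is fully uncertain.
_, _, u_unseen, p_unseen = opinion_from_evidence(0, 0)
```

An agent could, for instance, prefer arms with a high projected probability or a high uncertainty; how these quantities are combined is a design choice that this sketch leaves open.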

Download here

The Agent Web Model - Modelling web hacking for reinforcement learning

Laszlo Erdodi, Fabio Massimo Zennaro. Published in International Journal of Information Security, 2020

Website hacking is a frequent attack type used by malicious actors to obtain confidential information, modify the integrity of web pages or make websites unavailable. The tools used by attackers are becoming more and more automated and sophisticated, and malicious machine learning agents seem to be the next development in this line. In order to provide ethical hackers with similar tools, and to understand the impact and the limitations of artificial agents, we present in this paper a model that formalizes web hacking tasks for reinforcement learning agents. Our model, named Agent Web Model, considers web hacking as a capture-the-flag style challenge, and it defines reinforcement learning problems at seven different levels of abstraction. We discuss the complexity of these problems in terms of actions and states an agent has to deal with, and we show that such a model allows us to represent most of the relevant web vulnerabilities. Aware that the driver of advances in reinforcement learning is the availability of standardized challenges, we provide an implementation for the first three abstraction layers, in the hope that the community would consider these challenges in order to develop intelligent web hacking agents.

Download here

Firearm Detection via Convolutional Neural Networks: Comparing a Semantic Segmentation Model Against End-to-End Solutions

Alexander Egiazarov, Fabio Massimo Zennaro and Vasileios Mavroeidis. Published in IEEE Bigdata 2020 Workshop Cyberhunt, 2020

Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents such as terrorism, general criminal offences, or even domestic violence. One way of achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis. In this paper we conduct a comparison between a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting firearms via semantic segmentation. We evaluated both models from different points of view, including accuracy, computational and data complexity, flexibility and reliability. Our results show that a semantic segmentation model provides a considerable amount of flexibility and resilience in the low data environment compared to classical deep models, although its configuration and tuning present a challenge in achieving the same levels of accuracy as an end-to-end model.

Download here

Stack-based Buffer Overflow Detection using Recurrent Neural Networks

William Arild Dahl, Laszlo Erdodi, Fabio Massimo Zennaro. Published in arXiv, 2020

Detecting vulnerabilities in software is a critical challenge in the development and deployment of applications. One of the best-known and most dangerous vulnerabilities is the stack-based buffer overflow, which may allow potential attackers to execute malicious code. In this paper we consider the use of modern machine learning models, specifically recurrent neural networks, to detect stack-based buffer overflow vulnerabilities in the assembly code of a program. Since assembly code is a generic and common representation, focusing on this language allows us to potentially consider programs written in several different programming languages. Moreover, we subscribe to the hypothesis that code may be treated as natural language, and thus we process assembly code using standard architectures commonly employed in natural language processing. We perform a set of experiments aimed at confirming the validity of the natural language hypothesis and the feasibility of using recurrent neural networks for detecting vulnerabilities. Our results show that our architecture is able to capture subtle stack-based buffer overflow vulnerabilities that strongly depend on the context, thus suggesting that this approach may be extended to real-world settings, as well as to other forms of vulnerability detection.

Download here

A new decision making model based on Rank Centrality for GDM with fuzzy preference relations

Anis Yazidi, Magdalena Ivanovska, Fabio Massimo Zennaro, Pedro G. Lind, Enrique Herrera Viedma. Published in European Journal of Operational Research, 2021

Preference aggregation in Group Decision Making (GDM) is a substantial problem that has received a lot of research attention. Decision problems involving fuzzy preference relations constitute an important class within GDM. Legacy approaches dealing with the latter type of problems can be classified into indirect approaches, which involve deriving a group preference matrix as an intermediate step, and direct approaches, which deduce a group preference ranking based on individual preference rankings. Although the work on indirect approaches has been extensive in the literature, there is still a scarcity of research dealing with the direct approaches. In this paper we present a direct approach towards aggregating several fuzzy preference relations on a set of alternatives into a single weighted ranking of the alternatives. By mapping the pairwise preferences into transition probabilities, we are able to derive a preference ranking from the stationary distribution of a stochastic matrix. Interestingly, the ranking of the alternatives obtained with our method corresponds to the optimizer of the Maximum Likelihood Estimation of a particular Bradley-Terry-Luce model. Furthermore, we perform a theoretical sensitivity analysis of the proposed method supported by experimental results and illustrate our approach towards GDM with a concrete numerical example. This work opens avenues for solving GDM problems using elements of probability theory, and thus provides a sound theoretical foundation as well as a plausible statistical interpretation for the aggregation of expert opinions in GDM.
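The idea of ranking alternatives through the stationary distribution of a stochastic matrix can be sketched as below. This is a minimal illustration under simplifying assumptions, not the method of the paper: it uses a single fuzzy preference matrix `pref` (where `pref[i][j]` is the degree to which alternative `i` is preferred to `j`), a uniform scaling by `n - 1`, and plain power iteration. The function name `rank_from_preferences` and the example matrix are hypothetical.

```python
def rank_from_preferences(pref, iters=1000, tol=1e-12):
    """Hypothetical helper: scores from the stationary distribution of a
    random walk that tends to move towards preferred alternatives."""
    n = len(pref)
    # Transition i -> j is proportional to how strongly j is preferred over i.
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                P[i][j] = pref[j][i] / (n - 1)
        # Self-loop keeps each row summing to one (row-stochastic matrix).
        P[i][i] = 1.0 - sum(P[i])
    # Power iteration on the row vector pi: pi <- pi P.
    pi = [1.0 / n] * n
    for _ in range(iters):
        new = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        converged = max(abs(a - b) for a, b in zip(new, pi)) < tol
        pi = new
        if converged:
            break
    return pi


# Assumed example: alternative 0 is preferred to 1, and 1 to 2.
pref = [
    [0.5, 0.8, 0.9],
    [0.2, 0.5, 0.7],
    [0.1, 0.3, 0.5],
]
scores = rank_from_preferences(pref)
ranking = sorted(range(len(scores)), key=lambda k: -scores[k])
```

Alternatives that attract more probability mass in the long run receive higher scores, so sorting by score yields a weighted ranking.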

Download here

Simulating SQL Injection Vulnerability Exploitation Using Q-Learning Reinforcement Learning Agents

Laszlo Erdodi, Åvald Åslaugson Sommervoll, Fabio Massimo Zennaro. Published in Journal of Information Security and Applications, 2021

In this paper, we propose a first formalization of the process of exploitation of SQL injection vulnerabilities. We consider a simplification of the dynamics of SQL injection attacks by casting this problem as a security capture-the-flag challenge. We model it as a Markov decision process, and we implement it as a reinforcement learning problem. We then deploy different reinforcement learning agents tasked with learning an effective policy to perform SQL injection; we design our training in such a way that the agent learns not just a specific strategy to solve an individual challenge but a more generic policy that may be applied to perform SQL injection attacks against any system instantiated randomly by our problem generator. We analyze the results in terms of the quality of the learned policy and in terms of convergence time as a function of the complexity of the challenge and the learning agent’s complexity. Our work fits in the wider research on the development of intelligent agents for autonomous penetration testing and white-hat hacking, and our results aim to contribute to understanding the potential and the limits of reinforcement learning in a security environment.

Download here

SQL Injections and Reinforcement Learning: An Empirical Evaluation of the Role of Action Structure

Manuel Del Verme, Åvald Åslaugson Sommervoll, Laszlo Erdodi, Simone Totaro, Fabio Massimo Zennaro. Published in Nordic Conference on Secure IT Systems (NordSec) 2021, 2021

Penetration testing is a central problem in computer security, and recently, the application of machine learning techniques to this topic has gathered momentum. In this paper, we consider the problem of exploiting SQL injection vulnerabilities, and we represent it as a capture-the-flag scenario in which an attacker can submit strings to an input form with the aim of obtaining a flag token representing private information. We then model the attacker as a reinforcement learning agent that interacts with the server to learn an optimal policy leading to an exploit. We compare two agents: a simpler structured agent that relies on significant a priori knowledge and uses high-level actions; and a structureless agent that has minimal a priori knowledge and generates SQL statements. The comparison showcases the feasibility of developing agents that rely on less ad-hoc modeling and illustrates a possible direction to develop agents that may have wide applicability.

Download here

Abstraction between Structural Causal Models: A Review of Definitions and Properties

Fabio Massimo Zennaro. Published in UAI 2022 Workshop on Causal Representation Learning [Best paper award], 2022

Structural causal models (SCMs) are a widespread formalism to deal with causal systems. A recent direction of research has considered the problem of formally relating SCMs at different levels of abstraction, by defining maps between SCMs and imposing a requirement of interventional consistency. This paper offers a review of the solutions proposed so far, focusing on the formal properties of a map between SCMs, and highlighting the different layers (structural, distributional) at which these properties may be enforced. This allows us to distinguish families of abstractions that may or may not be permitted by choosing to guarantee certain properties instead of others. Such an understanding not only allows us to distinguish among proposals for causal abstraction with more awareness, but also allows us to tailor the definition of abstraction with respect to the forms of abstraction relevant to specific applications.

Download here

Towards Computing an Optimal Abstraction for Structural Causal Models

Fabio Massimo Zennaro, Paolo Turrini, Theodoros Damoulas. Published in UAI 2022 Workshop on Causal Representation Learning, 2022

Working with causal models at different levels of abstraction is an important feature of science. Existing work has already considered the problem of expressing formally the relation of abstraction between causal models. In this paper, we focus on the problem of learning abstractions. We start by defining the learning problem formally in terms of the optimization of a standard measure of consistency. We then point out the limitation of this approach, and we suggest extending the objective function with a term accounting for information loss. We suggest a concrete measure of information loss, and we illustrate its contribution to learning new abstractions.

Download here

Simulating All Archetypes Of SQL Injection Vulnerability Exploitation Using Reinforcement Learning Agents

Åvald Åslaugson Sommervoll, Laszlo Erdodi, Fabio Massimo Zennaro. Published in International Journal of Information Security, 2023

Vulnerabilities such as SQL injection represent a serious challenge to security. While tools with a pre-defined logic are commonly used in the field of penetration testing, the continually-evolving nature of the security challenge calls for models able to learn autonomously from experience. In this paper we build on previous results on the development of reinforcement learning models devised to exploit specific forms of SQL injection, and we design agents that are able to tackle a varied range of SQL injection vulnerabilities, virtually comprising all the archetypes normally considered by experts. We show that our agents, trained on a synthetic environment, perform a transfer of learning among the different SQL injection challenges; in particular, they learn to use their queries to efficiently gain knowledge about multiple vulnerabilities at once. We also introduce a novel and more versatile way to interpret server messages that reduces reliance on expert inputs. Our simulations show the feasibility of our approach, which easily deals with a number of homogeneous challenges, as well as some of its limitations when presented with problems having higher degrees of uncertainty.

Modeling Penetration Testing with Reinforcement Learning Using Capture-the-Flag Challenges: Trade-offs between Model-free Learning and A Priori Knowledge

Fabio Massimo Zennaro, Laszlo Erdodi. Published in IET Information Security, 2023

Penetration testing is a security exercise aimed at assessing the security of a system by simulating attacks against it. So far, penetration testing has been carried out mainly by trained human attackers and its success critically depended on the available expertise. Automating this practice constitutes a non-trivial problem, as the range of actions that a human expert may attempt against a system and the range of knowledge she relies on to take her decisions are hard to capture. In this paper, we focus our attention on simplified penetration testing problems expressed in the form of capture-the-flag hacking challenges, and we apply reinforcement learning algorithms to try to solve them. In modeling these capture-the-flag competitions as reinforcement learning problems we highlight the specific challenges that characterize penetration testing. We observe these challenges experimentally across a set of varied simulations, and we study how different reinforcement learning techniques may help us address these challenges. In this way we show the feasibility of tackling penetration testing using reinforcement learning, and we highlight the challenges that must be taken into consideration, and possible directions to solve them.

Download here

Jointly Learning Consistent Causal Abstractions Over Multiple Interventional Distributions

Fabio Massimo Zennaro, Máté Drávucz, Geanina Apachitei, W. Dhammika Widanage, Theodoros Damoulas. Published in CLeaR 2023 (Causal Learning and Reasoning) [Oral presentation: 9% acceptance rate], 2023

An abstraction can be used to relate two structural causal models representing the same system at different levels of resolution. Learning abstractions which guarantee consistency with respect to interventional distributions would allow one to jointly reason about evidence across multiple levels of granularity while respecting the underlying cause-effect relationships. In this paper, we introduce a first framework for causal abstraction learning between SCMs based on the formalization of abstraction recently proposed by Rischel (2020). Based on that, we propose a differentiable programming solution that jointly solves a number of combinatorial sub-problems, and we study its performance and benefits against independent and sequential approaches on synthetic settings and on a challenging real-world problem related to electric vehicle battery manufacturing.

Download here

Quantifying Consistency and Information Loss for Causal Abstraction Learning

Fabio Massimo Zennaro, Paolo Turrini, Theodoros Damoulas. Published in IJCAI 2023 (International Joint Conference on Artificial Intelligence), 2023

Structural causal models provide a formalism to express causal relations between variables of interest. Models and variables can represent a system at different levels of abstraction, whereby variables and relations may be coarsened and refined according to the need of an agent or a modeller. However, switching between different levels of abstraction requires evaluating the trade-off between the consistency and the information loss among models at different levels of abstraction. In this paper we introduce a family of interventional measures that an agent or a modeller may use to evaluate such a trade-off. We analyze the properties of these measures, and propose algorithms to evaluate and learn causal abstractions. Finally, we illustrate the flexibility of our setup by empirically showing how different measures and algorithmic choices may lead to different abstractions.

Download here

Interventionally Consistent Surrogates for Agent-based Simulators

Joel Dyer, Nicholas Bishop, Yorgos Felekis, Fabio Massimo Zennaro, Anisoara Calinescu, Theodoros Damoulas, Michael Wooldridge. Published in arXiv, 2023

Agent-based simulators provide granular representations of complex intelligent systems by directly modelling the interactions of the system’s constituent agents. Their high-fidelity nature enables hyper-local policy evaluation and testing of what-if scenarios, but is associated with large computational costs that inhibit their widespread use. Surrogate models can address these computational limitations, but they must behave consistently with the agent-based model under policy interventions of interest. In this paper, we capitalise on recent developments on causal abstractions to develop a framework for learning interventionally consistent surrogate models for agent-based simulators. Our proposed approach facilitates rapid experimentation with policy interventions in complex systems, while inducing surrogates to behave consistently with high probability with respect to the agent-based simulator across interventions of interest. We demonstrate with empirical studies that observationally trained surrogates can misjudge the effect of interventions and misguide policymakers towards suboptimal policies, while surrogates trained for interventional consistency with our proposed method closely mimic the behaviour of an agent-based model under interventions of interest.

Download here

Causal Optimal Transport of Abstractions

Yorgos Felekis, Fabio Massimo Zennaro, Nicola Branchini, Theodoros Damoulas. Published in CLeaR 2024 (Causal Learning and Reasoning), 2024

Causal abstraction (CA) theory establishes formal criteria for relating multiple structural causal models (SCMs) at different levels of granularity by defining maps between them. These maps have significant relevance for real-world challenges such as synthesizing causal evidence from multiple experimental environments, learning causally consistent representations at different resolutions, and linking interventions across multiple SCMs. In this work, we propose COTA, the first method to learn abstraction maps from observational and interventional data without assuming complete knowledge of the underlying SCMs. In particular, we introduce a multi-marginal Optimal Transport (OT) formulation that enforces do-calculus causal constraints, together with a cost function that relies on interventional information. We extensively evaluate COTA on synthetic and real world problems, and showcase its advantages over non-causal, independent and aggregated COTA formulations. Finally, we demonstrate the efficiency of our method as a data augmentation tool by comparing it against the state-of-the-art CA learning framework, which assumes fully specified SCMs, on a real-world downstream task.

Download here

Learning Consistent Causal Abstractions with Genetic Algorithms

Fabio Massimo Zennaro and Kai Olav Ellefsen. Published in NORA 2024 (Norwegian Artificial Intelligence Research Consortium Conference), 2024

Recent work in machine learning and artificial intelligence has dealt with bringing together two fundamental concepts underlying intelligence: causality and abstraction. Given two SCMs representing the same system at different levels of detail, learning an abstraction between them turned out to be a hard combinatorial problem, and solutions have been proposed relying on neural networks and optimal transport. In this paper, we consider the original problem of abstraction learning and we generalize it. We start by highlighting a connection to the standard bin packing problem, and then we show that the problem at hand is actually an instance of a larger class of non-decomposable products of combinatorial problems. We propose an approach to solving these problems relying on genetic algorithms. Compared with a gradient descent approach, genetic algorithms do not require a relaxation of the problem and they can explore different regions of the solution space without getting stuck in sub-optimal minima. Furthermore, we also show that the compositional form of the problem is well suited to the schemata hypothesis for genetic algorithms.

Download here

Causally Abstracted Multi-armed Bandits

Fabio Massimo Zennaro, Nicholas Bishop, Joel Dyer, Yorgos Felekis, Anisoara Calinescu, Michael Wooldridge, Theodoros Damoulas. Published in UAI 2024 (Conference on Uncertainty in Artificial Intelligence), 2024

Multi-armed bandits (MAB) and causal MABs (CMAB) are established frameworks for decision-making problems. The majority of prior work typically studies and solves individual MAB and CMAB in isolation for a given problem and associated data. However, decision-makers are often faced with multiple related problems and multi-scale observations where joint formulations are needed in order to efficiently exploit the problem structures and data dependencies. Transfer learning for CMABs addresses the situation where models are defined on identical variables, although causal connections may differ. In this work, we extend transfer learning to setups involving CMABs defined on potentially different variables, with varying degrees of granularity, and related via an abstraction map. Formally, we introduce the problem of causally abstracted MABs (CAMABs) by relying on the theory of causal abstraction in order to express a rigorous abstraction map. We propose algorithms to learn in a CAMAB, and study their regret. We illustrate the limitations and the strengths of our algorithms on a real-world scenario related to online advertising.

Download here

talks

Review of Sparse Filtering

Published:

Sparse filtering is an algorithm for unsupervised learning proposed in 2011. The authors introduced this algorithm as a paradigm of feature distribution learning, contrasting it with more traditional data distribution learning. In this seminar, we will explore the ideas behind sparse filtering following the original paper published in 2011. We will first discuss the general idea of feature distribution learning; then, we will present the specific algorithm for sparse filtering; finally, we will conclude with a discussion of the algorithm and a summary of further developments since the publication of the original paper.

Introduction to Information Theoretical Learning

Published:

This talk is meant to be a simple introduction to Principe’s framework for Information Theoretic Learning. We will first review a standard information theoretic measure, going through its derivation, its properties and its limitations. We will then derive a more general form of this information theoretic measure, and we will use it to compute statistical estimators. Finally, we will define an information theoretic loss function that can be used for learning.

Research Challenges for Applying Machine Learning in Cybersecurity

Published:

This talk provides an overview of some topics at the intersection of cybersecurity and machine learning with the aim of illustrating the possibilities offered by machine learning and surveying recent promising lines of research at the border between the two disciplines. The first part of the talk gives a brief introduction to machine learning from a conceptual point of view. The second part then explores research topics in three main domains: applications of machine learning to security; security aspects of machine learning; and, finally, safety concerns related to machine learning.

Overview of Adversarial Machine Learning and AI Safety

Published:

This talk provides an overview of the research in the fields of adversarial machine learning and AI safety. The first part of the talk gives a brief introduction to machine learning from a conceptual point of view; the second and third parts respectively illustrate some representative attacks and defenses for machine learning systems; and, finally, the last part lists safety concerns related to machine learning and artificial intelligence. (This presentation has some overlap with the previous talk “Research Challenges for Applying Machine Learning in Cybersecurity”.)

Perspectives on AI/ML and Cybersecurity

Published:

This talk provides a short presentation of three perspectives to explore the intersection of cybersecurity and machine learning. It examines an instrumental perspective (in which ML is seen as a tool), a systemic perspective (in which ML is seen as a component of a system to defend), and a societal perspective (in which ML is seen as a part of societal processes). Each perspective is connected with a specific area of research (cybersecurity, adversarial learning, AI safety).

A Gentle Introduction to Causal Models

Published:

In this presentation we are going to introduce causal models from the point of view of computer science, following the approach based on structural causal models proposed by Pearl. We will start by showing the place of causality theory and by discussing its relationship with standard statistics. We will then present graphical models (directed acyclic graphs, Bayesian networks, causal Bayesian networks, and structural causal models) to address causal questions. We will then review some paradigmatic problems that arise in the field of causality, and how they can be solved.

Causal Models and Machine Learning

Published:

This talk aims at providing an overall understanding of the role of causal modelling, and its relationship to machine learning. We are going to introduce causal models following the popular approach based on structural causal models proposed by Pearl, and show how they can capture the notion of causal relations. We will consider paradigmatic causal problems (causal inference and causal discovery) and discuss how they can be tackled. Finally, we will briefly explore connections between causality and machine learning, touching on topics such as learning with causal assumptions, using counterfactuals to assess fairness, and expressing reinforcement learning problems in causal terms.

Modelling Capture-the-Flag Challenges Using Reinforcement Learning

Published:

These slides present the research project of modelling hacking in the form of capture-the-flag (CTF) games as problems solvable by agents trained by reinforcement learning (RL). The main assumptions and challenges are presented, along with some preliminary results.

Information Bottleneck (and Unsupervised Learning)

Published:

This short presentation introduces the method of information bottleneck by describing its formulation and by illustrating its application in analyzing the behaviour of deep neural networks. The presentation ends by discussing the problem of using a similar information-theoretic method to study the behaviour of unsupervised learning algorithms, focusing in particular on the analysis of the sparse filtering algorithm.

Neural Networks, Information Bottleneck and Unsupervised Learning

Published:

These slides provide a quick conceptual introduction to neural networks for supervised learning, and review some hypotheses and theories meant to explain the generalization performance of learning. The presentation then focuses on one of these possible interpretative frameworks, information bottleneck, and discusses its possible application to understand the dynamics of unsupervised learning algorithms, such as sparse filtering.

The (new) attack surfaces of data-learned models: Adversarial attacks and defenses for ML models

Published:

These slides provide an overview of the security of machine learning systems. We identify the two main attack surfaces inherent in machine learned systems, and we then provide a review of the main attacks and defenses, heavily relying on analogical reasoning to illustrate and explain these methods. The presentation ends with remarks on the practical implications of these vulnerabilities and the current directions of research.

Abstracting Causal Models

Published:

Structural causal models (SCMs) constitute a rigorous and tested formalism to deal with causality in many fields, including artificial intelligence and machine learning. Systems and phenomena may be modelled as SCMs and then studied using the tools provided by the framework of causality. A given system can, however, be modelled at different levels of abstraction, depending on the aims or the resources of a modeller. The most exemplary case is probably statistical physics, where a thermodynamical system may be represented either as a collection of microscopic particles or as a single body with macroscopic properties. In general, however, switching between models with different granularities presents non-trivial challenges and raises questions of consistency. These slides will first provide a brief introduction to SCMs, and then consider how we can express the problem of relating SCMs representing the same phenomenon at different levels of abstraction. Finally, we will discuss open challenges and present some existing solutions, as well as pointing towards possible future directions of research.

Abstracting Causal Models (short)

Published:

These slides provide a synthetic overview of the problem of relating structural causal models (SCMs) at different levels of abstraction. We define the problem and discuss the desiderata of our solution. We present a few of the existing formalizations and solutions offered in the literature. We then conclude by highlighting interesting future directions of research in this area.

Applications of reinforcement learning to computer security: problems, models, and perspectives

Published:

These slides analyze the application of reinforcement learning for modelling the problem of penetration testing in computer security. After a conceptual overview of reinforcement learning, we discuss the specific challenges in modelling penetration testing as a game that may be solved by a reinforcement learning agent. Finally, we present some of the work done by the research group at the University of Oslo on this topic, including conceptual modelling and preliminary practical implementations of reinforcement learning environments and agents.

Abstracting Causal Structural Models

Published:

In this presentation we consider the problem of relating causal models representing the same phenomenon or system at different levels of abstraction. A given system may be represented in more or less detail according to the resources or the need of a modeller; switching between descriptions at different levels of abstraction is not trivial, and it raises questions of consistency. We will focus in particular on structural causal models (SCMs) and we will express properties of consistency in this context. We will then present two formalisms for defining a relation of abstraction between SCMs: an approach based on the definition of a transformation between the outcomes of models, and an approach based on the definition of a mapping between the structure of models. We will then conclude with some observations and some questions regarding this current direction of research.

Abstraction between SCMs: A Review of Definitions and Properties

Published:

In this presentation we first offer a review of definitions of abstractions proposed in the literature, and then we propose a framework to align these definitions and evaluate their properties. We suggest analyzing abstractions on two layers (a structural layer and a distributional layer) and we review some basic properties that may be enforced on maps defined on each layer. We suggest that this framework may contribute to a better understanding of different forms of abstraction, as well as providing a way to tailor application-specific definitions of abstraction.

Abstraction of Causal Structural Models

Published:

Causal models can represent an identical system or phenomenon at different levels of abstraction. In this talk, we will focus on structural causal models (SCM) and review two frameworks which have been proposed to express a relation of abstraction between SCMs and to measure the interventional consistency of an abstraction. We will then discuss some current directions of research, including the problem of learning abstractions.

Jointly Learning Consistent Causal Abstractions Over Multiple Interventional Distributions

Published:

In this presentation we review the definition of abstraction between structural causal models and we frame the problem of learning a mapping between them. We discuss the challenges of learning a causal abstraction that minimizes the abstraction error in terms of interventional consistency. We then suggest an approach based on a relaxation and parametrization of the problem, leading to a solution based on differentiable programming. The solution approach is evaluated both on synthetic and real-world data.

Introduction to Causality: Structural Causal Modelling

Published:

In this talk we will introduce one of the most important formalisms to represent causal systems in computer science. We will start with a brief review of causality, highlighting the meaning of causal queries and the limitations of standard statistics and machine learning in answering them. To address these shortcomings, we will present the formalism of structural causal models (SCMs). We will then show how these models can be used to rigorously answer different types of causal questions, including observational, interventional and counterfactual questions. Finally, we will conclude by discussing how this formalization gives rise to a rich theory of causality, and how the ideas underlying causality have strong and promising intersections with artificial intelligence and machine learning.

Research in Machine Learning at UiB

Published:

In this talk we give a quick overview of the Machine Learning group at UiB and of its research interests and directions.

Structural causal models and abstraction for modelling battery manufacturing

Published:

Modelling complex systems and processes, such as battery manufacturing, is a significant scientific and technical challenge. Mathematics, statistics and machine learning provide useful tools to tackle this problem. In this talk, we will focus on the recent formalism of structural causal models (SCMs) and causal abstractions (CAs). We will first offer a high-level introduction to SCMs and CAs, discussing in particular their importance and relevance for modelling. We will then refer to our original methodology for learning CAs. Finally, we will showcase our preliminary results on the problem of modelling one stage of the lithium-ion battery manufacturing process, demonstrating the potential for integrating data collected by different research groups.

Quantifying Consistency and Information Loss for Causal Abstraction Learning

Published:

In this presentation we quickly review the idea of defining an abstraction between structural causal models and we present the standard measure of abstraction error proposed in the literature. We then consider some potential limitations when using this single measure to assess or learn abstractions. To overcome these limitations, we propose an extension of the original definition of abstraction approximation, we derive new measures of abstraction error, and we discuss theoretical and applied properties of these new measures.

Abstraction between Structural Causal Models and Measures of Abstraction Error

Published:

In this talk we discuss how rigorous relations between causal models may be defined and quantitatively evaluated. We will start with a quick introduction to the popular formalism of structural causal models. Next, we will review alternative proposals for expressing relations of abstractions between these models. We will then focus on one particular framework, and show how a notion of abstraction error can be introduced in this setup. Finally, we will discuss some of the limitations of this measure, and how alternative measures of error may be developed in order to capture different aspects of abstraction and fit different aims. We will conclude with a few considerations about possible future developments of this theory of abstraction.

Learning Causal Abstractions

Published:

In this presentation we review the definition of structural causal models and we introduce the problem of relating these models via an abstraction map. We formalize the problem of learning such a causal abstraction map as a minimizer of an abstraction error expressed in terms of interventional consistency, and we discuss some of the challenges involved in this optimization problem. We then present an approach based on a relaxation and parametrization of the problem, leading to a solution based on differentiable programming. The solution approach is evaluated both on synthetic and real-world data.

teaching

Digital Biology

Undergraduate course, University of Manchester, 2015

Teaching assistance in Digital Biology

Introduction to Machine Learning with Python

Undergraduate and graduate course, OsloMet University, 2020

Guest lecturer for a class providing a brief introduction to the key concepts of machine learning and a short tutorial on using scikit-learn to instantiate and train models.

Reinforcement Learning

Graduate course, University of Bergen, 2024

Lecturer for the course INF368A Selected Topics in Machine Learning. The course covers the foundations of Reinforcement Learning, from modelling with Markov Decision Processes to applications of Deep Reinforcement Learning.