Page Not Found
Page not found. Your pixels are in another canvas.
A list of all the posts and pages found on the site. For you robots out there is an XML version available for digesting as well.
About me
This is a page not in the main menu
Published:
In previous posts we have discussed theoretical and conceptual properties of abstractions between causal models, while at the same time implementing code for checking our statements and for running simulations.
Published:
In previous posts we have discussed abstractions between causal models; we have evaluated the abstraction error between a micromodel and a macromodel with respect to a property of interventional consistency.
Published:
Causality and causal inference deal with expressing and reasoning about relationships of cause and effect, and structural causal models provide a rigorous formalism to assess causality.
Published:
In previous posts we have seen how to define abstractions between causal models, and how to compute automatically the abstraction error when switching from a micromodel to a macromodel.
Published:
In previous posts we have explored how causal models may be related to each other at different levels of abstraction using the framework proposed by Rischel to assess their consistency and evaluate the error that may be introduced by abstraction.
Published:
In previous posts we have analyzed how causal models may be related to each other at different levels of abstraction relying on the framework proposed by Rischel and grounded in category theory.
Published:
In a previous post we have discussed how causal models may be related to each other at different levels of abstraction. In particular, following the work of Rischel, we reviewed how we can precisely define abstraction, model it using category theory, and evaluate abstraction error.
Published:
Causal models offer a rigorous formalism to express causal relations between variables of interest. Causal systems may be represented at different levels of granularity or abstraction; think, for example, of microscopic and macroscopic descriptions of thermodynamic systems. Reasoning about the relationship between causal models at different levels of abstraction is a non-trivial problem.
Published:
Sparse filtering is a recently developed unsupervised feature distribution learning algorithm with interesting properties. Basic implementations are available in Matlab and in Python, the latter relying on the numpy library. In this post we explain and analyze a re-implementation of the Python code that works within the tensorflow framework.
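As a rough illustration of what such an implementation computes, the sketch below evaluates the sparse filtering objective in plain numpy. It follows the standard formulation of the algorithm (soft absolute value, normalization of each feature across samples, normalization of each sample across features, then an L1 penalty); the function name, argument layout and the `eps` constant are assumptions made for this example, and it is not the code discussed in the post.

```python
import numpy as np

def sparse_filtering_objective(W, X, eps=1e-8):
    """Sparse filtering objective (illustrative sketch).

    W: weight matrix of shape (n_features, n_inputs)
    X: data matrix of shape (n_inputs, n_samples)
    """
    # Soft absolute value of the linear features
    F = np.sqrt((W @ X) ** 2 + eps)
    # Normalize each feature (row) to unit L2 norm across samples
    F = F / np.linalg.norm(F, axis=1, keepdims=True)
    # Normalize each sample (column) to unit L2 norm across features
    F = F / np.linalg.norm(F, axis=0, keepdims=True)
    # L1 penalty; all entries are non-negative, so a plain sum suffices
    return F.sum()
```

Minimizing this value with respect to W (for instance with a gradient-based optimizer) pushes each sample to be represented by few active features.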
Published:
Differentiable programming (also known as software 2.0) offers a novel approach to coding, focused on defining parametrized differentiable models to solve a problem instead of coding a precise algorithm. In this post we explore the use of this coding paradigm to solve the problem of consensus reaching in group decision making.
Published:
In this post we consider again the problem of performing causal analysis on covid19 data, specifically the question "How is the implementation of existing strategies affecting the rates of COVID-19 infection?".
Published:
In this post we present a mock setup for performing causal analyses on covid19 data using the dowhy library for causal analysis.
Published:
In this post we present a game developed with my colleague Åvald and submitted to the IBM Quantum Game Challenge. This project exploits the integration of IBM qiskit, a library developed to design, run and simulate quantum circuits, and OpenAI gym, a library developed to define, train and run reinforcement learning agents.
Published:
In this post we keep exploring the integration of IBM qiskit, a library developed to design, run and simulate quantum circuits, and OpenAI gym, a library developed to define, train and run reinforcement learning agents.
Published:
In this post we explore the integration of IBM qiskit, a library developed to design, run and simulate quantum circuits, and OpenAI gym, a library developed to define, train and run reinforcement learning agents.
Published:
In previous posts we have explored the use of Bayesian coresets on synthetic data and their application to a phishing data set. We replicated experiments from the original article by integrating the original code with the Edward framework.
Published:
Causal inference tackles the problem of dealing with causal statements. A rigorous statistical formalism to assess causality has been proposed by Pearl.
Published:
In a previous post we have explored the use of Bayesian coresets on synthetic data and we have integrated the original code with Edward in order to exploit the features offered by probabilistic programming.
Published:
Modern datasets often contain a large number of redundant samples, making the storing of data and the learning of models expensive. Coreset computation is an approach to reduce the amount of samples by selecting (and weighting) informative samples and discarding redundant ones.
Published:
This class offers an introduction to the topic of unsupervised learning. It first contrasts the problem of unsupervised learning to the problem of supervised learning. It presents some conceptual tools (representations, structure) to formalize and reason about the unsupervised learning problem. These concepts are then used to discuss several prototypical unsupervised learning tasks, from clustering to generative modelling. Finally three paradigmatic unsupervised learning algorithms (PCA, k-means, and autoencoders) are analyzed and evaluated.
Published:
These slides are a contribution to a class on future developments of artificial intelligence and machine learning. The first set of slides introduces the problem of fair decision-making in machine learning systems, and illustrates some sample solutions underlining ideas and flaws. The second set of slides discusses the problem of assessing causal relations in data, and points to solutions in the direction of causal graphical models.
Published:
This class offers an introduction to machine learning using python. After a brief review of the main concepts in machine learning, the class discusses the main python environments to run and train machine learning models. Examples of basic applications of machine learning models (linear regression, logistic regression, k-means, classification trees, neural networks) are then illustrated using scikit-learn. Finally, references for further learning are provided.
Published:
Proof (in Italian).
Published:
BSc Dissertation (in Italian).
Published:
MSc coursework essay.
Published:
MSc coursework essay (in Italian).
Published:
MSc Dissertation (in Italian).
Published:
MSc coursework essay.
Published:
MSc Dissertation.
Published:
Chapters in PhD dissertation.
Published:
PhD Dissertation.
Fabio Massimo Zennaro, Ke Chen. Published in Neural Networks, 2016
In this paper we present a theoretical analysis to understand sparse filtering, a recent and effective algorithm for unsupervised learning. The aim of this research is not to show whether or how well sparse filtering works, but to understand why and when sparse filtering does work. We provide a thorough theoretical analysis of sparse filtering and its properties, and further offer an experimental validation of the main outcomes of our theoretical analysis. We show that sparse filtering works by explicitly maximizing the entropy of the learned representation through the maximization of the proxy of sparsity, and by implicitly preserving mutual information between original and learned representations through the constraint of preserving a structure of the data, specifically the structure defined by relations of neighborhoodness under the cosine distance. Furthermore, we empirically validate our theoretical results with artificial and real data sets, and we apply our theoretical understanding to explain the success of sparse filtering on real-world problems. Our work provides a strong theoretical basis for understanding sparse filtering: it highlights assumptions and conditions for success behind this feature distribution learning algorithm, and provides insights for developing new feature distribution learning algorithms
Download here
Fabio Massimo Zennaro, Ke Chen. Published in NIPS 2016 Workshop on Learning in High Dimensions with Structure, 2016
This paper explores the use of sparse filtering algorithms applied to the problem of covariate shift adaptation. We suggest a novel algorithm, periodic sparse filtering, and we consider its application to structured high-dimensional data.
Download here
Fabio Massimo Zennaro, Ke Chen. Published in arXiv, 2016
In this paper we formally analyse the use of sparse filtering algorithms to perform covariate shift adaptation. We provide a theoretical analysis of sparse filtering by evaluating the conditions required to perform covariate shift adaptation. We prove that sparse filtering can perform adaptation only if the conditional distribution of the labels has a structure explained by a cosine metric. To overcome this limitation, we propose a new algorithm, named periodic sparse filtering, and carry out the same theoretical analysis regarding covariate shift adaptation. We show that periodic sparse filtering can perform adaptation under the looser and more realistic requirement that the conditional distribution of the labels has a periodic structure, which may be satisfied, for instance, by user-dependent data sets. We experimentally validate our theoretical results on synthetic data. Moreover, we apply periodic sparse filtering to real-world data sets to demonstrate that this simple and computationally efficient algorithm is able to achieve competitive performances.
Download here
Fabio Massimo Zennaro, Magdalena Ivanovska. Published in ICML 2018 Workshop on Machine Learning for Causal Inference, Counterfactual Prediction, and Autonomous Action, 2018
In this paper we consider the problem of combining multiple probabilistic causal models, provided by different experts, under the requirement that the aggregated model satisfy the criterion of counterfactual fairness. We build upon the work on causal models and fairness in machine learning, and we express the problem of combining multiple models within the framework of opinion pooling. We propose two simple algorithms, grounded in the theory of counterfactual fairness and causal judgment aggregation, that are guaranteed to generate aggregated probabilistic causal models respecting the criterion of fairness, and we compare their behaviors on a toy case study.
Download here
Fabio Massimo Zennaro, Magdalena Ivanovska, Audun Jøsang. Published in International Journal of Approximate Reasoning, 2018
In this paper we analyze the use of subjective logic as a framework for performing approximate transformations over probability distribution functions. As for any approximation, we evaluate subjective logic in terms of computational efficiency and bias. However, while the computational cost may be easily estimated, the bias of subjective logic operators has not yet been investigated. In order to evaluate this bias, we propose an experimental protocol that exploits Monte Carlo simulations and their properties to assess the distance between the result produced by subjective logic operators and the true result of the corresponding transformation over probability distributions. This protocol allows a modeler to get an estimate of the degree of approximation she must be ready to accept as a trade-off for the computational efficiency and the interpretability of the subjective logic framework. Concretely, we apply our method to the relevant case study of the subjective logic operator for binomial multiplication and we study empirically its approximation.
Download here
Fabio Massimo Zennaro, Magdalena Ivanovska. Published in 16th European Conference on Multi-Agent Systems (EUMAS), 2018
In this paper we study the problem of making predictions using multiple structural causal models defined by different agents, under the constraint that the prediction satisfies the criterion of counterfactual fairness. Relying on the frameworks of causality, fairness and opinion pooling, we build upon and extend previous work focusing on the qualitative aggregation of causal Bayesian networks and causal models. In order to complement previous qualitative results, we devise a method based on Monte Carlo simulations. This method enables a decision-maker to aggregate the outputs of the causal models provided by different experts while guaranteeing the counterfactual fairness of the result. We demonstrate our approach on a simple, yet illustrative, toy case study.
Download here
Fabio Massimo Zennaro. Published in ECML 2019 Workshop on Machine Learning for CyberSecurity, 2019
In this paper we offer a preliminary study of the application of Bayesian coresets to network security data. Network intrusion detection is a field that could take advantage of Bayesian machine learning in modelling uncertainty and managing streaming data; however, the large size of the data sets often hinders the use of Bayesian learning methods based on MCMC. Limiting the amount of useful data is a central problem in a field like network traffic analysis, where large amounts of redundant data can be generated very quickly via packet collection. Reducing the number of samples would not only make learning more feasible, but would also contribute to reducing the need for memory and storage. We explore here the use of Bayesian coresets, a technique that reduces the amount of data samples while guaranteeing the learning of an accurate posterior distribution using Bayesian learning. We analyze how Bayesian coresets affect the accuracy of learned models, and how time-space requirements are traded off, both in a static scenario and in a streaming scenario.
Download here
Alexander Egiazarov, Vasileios Mavroeidis and Fabio Massimo Zennaro. Published in European Intelligence and Security Informatics Conference (EISIC) 2019, 2019
In recent years we have seen an upsurge in terror attacks around the world. Such attacks usually happen in public places with large crowds to cause the most damage possible and get the most attention. Even though surveillance cameras are assumed to be a powerful tool, their effect in preventing crime is far from clear due to either limitation in the ability of humans to vigilantly monitor video surveillance or for the simple reason that they are operating passively. In this paper, we present a weapon detection system based on an ensemble of semantic convolutional neural networks that decomposes the problem of detecting and locating a weapon into a set of smaller problems concerned with the individual component parts of a weapon. This approach has computational and practical advantages: a set of simpler neural networks dedicated to specific tasks requires less computational resources and can be trained in parallel; the overall output of the system given by the aggregation of the outputs of individual networks can be tuned by a user to trade-off false positives and false negatives; finally, according to ensemble theory, the output of the overall system will be robust and reliable even in the presence of weak individual models. We evaluated our system running simulations aimed at assessing the accuracy of individual networks and the whole system. The results on synthetic data and real-world data are promising, and they suggest that our approach may have advantages compared to the monolithic approach based on a single deep convolutional neural network.
Download here
Fabio Massimo Zennaro, Ke Chen. Published in arXiv, 2019
In this paper we examine a formalization of feature distribution learning (FDL) in information-theoretic terms relying on the analytical approach and on the tools already used in the study of the information bottleneck (IB). It has been conjectured that the behavior of FDL algorithms could be expressed as an optimization problem over two information-theoretic quantities: the mutual information of the data with the learned representations and the entropy of the learned distribution. In particular, such a formulation was offered in order to explain the success of the most prominent FDL algorithm, sparse filtering (SF). This conjecture was, however, left unproven. In this work, we aim at providing preliminary empirical support to this conjecture by performing experiments reminiscent of the work done on deep neural networks in the context of the IB research. Specifically, we borrow the idea of using information planes to analyze the behavior of the SF algorithm and gain insights on its dynamics. A confirmation of the conjecture about the dynamics of FDL may provide solid ground to develop information-theoretic tools to assess the quality of the learning process in FDL, and it may be extended to other unsupervised learning algorithms.
Download here
Fabio Massimo Zennaro. Published in ECML 2020 Workshop on Data Science for Social Good, 2020
In this paper we discuss the political value of the decision to adopt machine learning in the field of criminal justice. While a lively discussion in the community focuses on the issue of the social fairness of machine learning systems, we suggest that another relevant aspect of this debate concerns the political implications of the decision of using machine learning systems. Relying on the theory of Left realism, we argue that, from several points of view, modern supervised learning systems, broadly defined as functional learned systems for decision making, fit into an approach to crime that is close to the law and order stance. Far from offering a political judgment of value, the aim of the paper is to raise awareness about the potential implicit, and often overlooked, political assumptions and political values that may be undergirding a decision that is apparently purely technical.
Download here
Fabio Massimo Zennaro, Audun Jøsang. Published in ECML 2020 Workshop on Uncertainty in Machine Learning, 2020
The multi-armed bandit problem is a classical decision-making problem where an agent has to learn an optimal action balancing exploration and exploitation. Properly managing this trade-off requires a correct assessment of uncertainty; in multi-armed bandits, as in other machine learning applications, it is important to distinguish between stochasticity that is inherent to the system (aleatoric uncertainty) and stochasticity that derives from the limited knowledge of the agent (epistemic uncertainty). In this paper we consider the formalism of subjective logic, a concise and expressive framework to express Dirichlet-multinomial models as subjective opinions, and we apply it to the problem of multi-armed bandits. We propose new algorithms grounded in subjective logic to tackle the multi-armed bandit problem, we compare them against classical algorithms from the literature, and we analyze the insights they provide in evaluating the dynamics of uncertainty. Our preliminary results suggest that subjective logic quantities enable useful assessment of uncertainty that may be exploited by more refined agents.
Download here
Laszlo Erdodi, Fabio Massimo Zennaro. Published in International Journal of Information Security, 2020
Website hacking is a frequent attack type used by malicious actors to obtain confidential information, modify the integrity of web pages or make websites unavailable. The tools used by attackers are becoming more and more automated and sophisticated, and malicious machine learning agents seem to be the next development in this line. In order to provide ethical hackers with similar tools, and to understand the impact and the limitations of artificial agents, we present in this paper a model that formalizes web hacking tasks for reinforcement learning agents. Our model, named Agent Web Model, considers web hacking as a capture-the-flag style challenge, and it defines reinforcement learning problems at seven different levels of abstraction. We discuss the complexity of these problems in terms of actions and states an agent has to deal with, and we show that such a model allows us to represent most of the relevant web vulnerabilities. Aware that the driver of advances in reinforcement learning is the availability of standardized challenges, we provide an implementation for the first three abstraction layers, in the hope that the community would consider these challenges in order to develop intelligent web hacking agents.
Download here
Alexander Egiazarov, Fabio Massimo Zennaro and Vasileios Mavroeidis. Published in IEEE Bigdata 2020 Workshop Cyberhunt, 2020
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents such as terrorism, general criminal offences, or even domestic violence. One way of achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis. In this paper we conduct a comparison between a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting firearms via semantic segmentation. We evaluated both models from different points of view, including accuracy, computational and data complexity, flexibility and reliability. Our results show that a semantic segmentation model provides a considerable amount of flexibility and resilience in the low data environment compared to classical deep models, although its configuration and tuning present a challenge in achieving the same levels of accuracy as an end-to-end model.
Download here
William Arild Dahl, Laszlo Erdodi, Fabio Massimo Zennaro. Published in arXiv, 2020
Detecting vulnerabilities in software is a critical challenge in the development and deployment of applications. One of the best known and most dangerous vulnerabilities is the stack-based buffer overflow, which may allow potential attackers to execute malicious code. In this paper we consider the use of modern machine learning models, specifically recurrent neural networks, to detect stack-based buffer overflow vulnerabilities in the assembly code of a program. Since assembly code is a generic and common representation, focusing on this language allows us to potentially consider programs written in several different programming languages. Moreover, we subscribe to the hypothesis that code may be treated as natural language, and thus we process assembly code using standard architectures commonly employed in natural language processing. We perform a set of experiments aimed at confirming the validity of the natural language hypothesis and the feasibility of using recurrent neural networks for detecting vulnerabilities. Our results show that our architecture is able to capture subtle stack-based buffer overflow vulnerabilities that strongly depend on the context, thus suggesting that this approach may be extended to real-world settings, as well as to other forms of vulnerability detection.
Download here
Anis Yazidi, Magdalena Ivanovska, Fabio Massimo Zennaro, Pedro G. Lind, Enrique Herrera Viedma. Published in European Journal of Operational Research, 2021
Preference aggregation in Group Decision Making (GDM) is a substantial problem that has received a lot of research attention. Decision problems involving fuzzy preference relations constitute an important class within GDM. Legacy approaches dealing with the latter type of problems can be classified into indirect approaches, which involve deriving a group preference matrix as an intermediate step, and direct approaches, which deduce a group preference ranking based on individual preference rankings. Although the work on indirect approaches has been extensive in the literature, there is still a scarcity of research dealing with the direct approaches. In this paper we present a direct approach towards aggregating several fuzzy preference relations on a set of alternatives into a single weighted ranking of the alternatives. By mapping the pairwise preferences into transition probabilities, we are able to derive a preference ranking from the stationary distribution of a stochastic matrix. Interestingly, the ranking of the alternatives obtained with our method corresponds to the optimizer of the Maximum Likelihood Estimation of a particular Bradley-Terry-Luce model. Furthermore, we perform a theoretical sensitivity analysis of the proposed method supported by experimental results and illustrate our approach towards GDM with a concrete numerical example. This work opens avenues for solving GDM problems using elements of probability theory, and thus provides a sound theoretical foundation as well as a plausible statistical interpretation for the aggregation of expert opinions in GDM.
Download here
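The idea of reading a ranking off the stationary distribution of a stochastic matrix, mentioned in the abstract above, can be sketched as follows. This is only an illustration under stated assumptions: the averaging of the experts' matrices, the specific mapping from preferences to transition probabilities, and the function name `ranking_from_preferences` are choices made for this example and are not taken from the paper.

```python
import numpy as np

def ranking_from_preferences(pref_matrices, n_iter=1000):
    """Weights for n alternatives from fuzzy preference matrices.

    Each matrix P satisfies P[i, j] + P[j, i] = 1, where P[i, j] is the
    degree to which alternative i is preferred over alternative j.
    """
    # Assumption: experts are pooled by a simple average of their matrices
    P = np.mean(pref_matrices, axis=0)
    n = P.shape[0]
    # Assumption: from i, move to j with probability P[j, i] / (n - 1),
    # i.e. the walk drifts towards preferred alternatives
    T = P.T / (n - 1)
    idx = np.arange(n)
    T[idx, idx] = 0.0
    # Remaining probability mass stays on the current alternative
    T[idx, idx] = 1.0 - T.sum(axis=1)
    # Power iteration for the stationary distribution (pi = pi T)
    pi = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        pi = pi @ T
    return pi / pi.sum()
```

With this particular mapping, if the preferences have the Bradley-Terry-Luce form P[i, j] = w_i / (w_i + w_j), the chain satisfies detailed balance with respect to w, so the stationary distribution returned is proportional to w; sorting the alternatives by decreasing weight gives the ranking.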
Laszlo Erdodi, Åvald Åslaugson Sommervoll, Fabio Massimo Zennaro. Published in Journal of Information Security and Applications, 2021
In this paper, we propose a first formalization of the process of exploitation of SQL injection vulnerabilities. We consider a simplification of the dynamics of SQL injection attacks by casting this problem as a security capture-the-flag challenge. We model it as a Markov decision process, and we implement it as a reinforcement learning problem. We then deploy different reinforcement learning agents tasked with learning an effective policy to perform SQL injection; we design our training in such a way that the agent learns not just a specific strategy to solve an individual challenge but a more generic policy that may be applied to perform SQL injection attacks against any system instantiated randomly by our problem generator. We analyze the results in terms of the quality of the learned policy and in terms of convergence time as a function of the complexity of the challenge and the learning agent’s complexity. Our work fits in the wider research on the development of intelligent agents for autonomous penetration testing and white-hat hacking, and our results aim to contribute to understanding the potential and the limits of reinforcement learning in a security environment.
Download here
Manuel Del Verme, Åvald Åslaugson Sommervoll, Laszlo Erdodi, Simone Totaro, Fabio Massimo Zennaro. Published in Nordic Conference on Secure IT Systems (NordSec) 2021, 2021
Penetration testing is a central problem in computer security, and recently, the application of machine learning techniques to this topic has gathered momentum. In this paper, we consider the problem of exploiting SQL injection vulnerabilities, and we represent it as a capture-the-flag scenario in which an attacker can submit strings to an input form with the aim of obtaining a flag token representing private information. We then model the attacker as a reinforcement learning agent that interacts with the server to learn an optimal policy leading to an exploit. We compare two agents: a simpler structured agent that relies on significant a priori knowledge and uses high-level actions; and a structureless agent that has minimal a priori knowledge and generates SQL statements. The comparison showcases the feasibility of developing agents that rely on less ad-hoc modeling and illustrates a possible direction to develop agents that may have wide applicability.
Download here
Fabio Massimo Zennaro. Published in UAI 2022 Workshop on Causal Representation Learning [Best paper award], 2022
Structural causal models (SCMs) are a widespread formalism to deal with causal systems. A recent direction of research has considered the problem of formally relating SCMs at different levels of abstraction, by defining maps between SCMs and imposing a requirement of interventional consistency. This paper offers a review of the solutions proposed so far, focusing on the formal properties of a map between SCMs, and highlighting the different layers (structural, distributional) at which these properties may be enforced. This allows us to distinguish families of abstractions that may or may not be permitted by choosing to guarantee certain properties instead of others. Such an understanding not only allows us to distinguish among proposals for causal abstraction with more awareness, but also allows us to tailor the definition of abstraction with respect to the forms of abstraction relevant to specific applications.
Download here
Fabio Massimo Zennaro, Paolo Turrini, Theodoros Damoulas. Published in UAI 2022 Workshop on Causal Representation Learning, 2022
Working with causal models at different levels of abstraction is an important feature of science. Existing work has already considered the problem of expressing formally the relation of abstraction between causal models. In this paper, we focus on the problem of learning abstractions. We start by defining the learning problem formally in terms of the optimization of a standard measure of consistency. We then point out the limitation of this approach, and we suggest extending the objective function with a term accounting for information loss. We suggest a concrete measure of information loss, and we illustrate its contribution to learning new abstractions.
Download here
Åvald Åslaugson Sommervoll, Laszlo Erdodi, Fabio Massimo Zennaro. Published in International Journal of Information Security, 2023
Vulnerabilities such as SQL injection represent a serious challenge to security. While tools with a pre-defined logic are commonly used in the field of penetration testing, the continually evolving nature of the security challenge calls for models able to learn autonomously from experience. In this paper we build on previous results on the development of reinforcement learning models devised to exploit specific forms of SQL injection, and we design agents that are able to tackle a varied range of SQL injection vulnerabilities, virtually comprising all the archetypes normally considered by experts. We show that our agents, trained on a synthetic environment, perform a transfer of learning among the different SQL injection challenges; in particular, they learn to use their queries to efficiently gain knowledge about multiple vulnerabilities at once. We also introduce a novel and more versatile way to interpret server messages that reduces reliance on expert inputs. Our simulations show the feasibility of our approach, which easily deals with a number of homogeneous challenges, as well as some of its limitations when presented with problems having higher degrees of uncertainty.
Fabio Massimo Zennaro, Laszlo Erdodi. Published in IET Information Security, 2023
Penetration testing is a security exercise aimed at assessing the security of a system by simulating attacks against it. So far, penetration testing has been carried out mainly by trained human attackers and its success critically depended on the available expertise. Automating this practice constitutes a non-trivial problem, as the range of actions that a human expert may attempt against a system and the range of knowledge she relies on to make her decisions are hard to capture. In this paper, we focus our attention on simplified penetration testing problems expressed in the form of capture the flag hacking challenges, and we apply reinforcement learning algorithms to try to solve them. In modeling these capture the flag competitions as reinforcement learning problems we highlight the specific challenges that characterize penetration testing. We observe these challenges experimentally across a set of varied simulations, and we study how different reinforcement learning techniques may help us address these challenges. In this way we show the feasibility of tackling penetration testing using reinforcement learning, and we highlight the challenges that must be taken into consideration, and possible directions to solve them.
Download here
Fabio Massimo Zennaro, Máté Drávucz, Geanina Apachitei, W. Dhammika Widanage, Theodoros Damoulas. Published in CLeaR 2023 (Causal Learning and Reasoning) [Oral presentation: 9% acceptance rate], 2023
An abstraction can be used to relate two structural causal models representing the same system at different levels of resolution. Learning abstractions which guarantee consistency with respect to interventional distributions would allow one to jointly reason about evidence across multiple levels of granularity while respecting the underlying cause-effect relationships. In this paper, we introduce a first framework for causal abstraction learning between SCMs based on the formalization of abstraction recently proposed by Rischel (2020). Based on that, we propose a differentiable programming solution that jointly solves a number of combinatorial sub-problems, and we study its performance and benefits against independent and sequential approaches on synthetic settings and on a challenging real-world problem related to electric vehicle battery manufacturing.
Download here
Fabio Massimo Zennaro, Paolo Turrini, Theodoros Damoulas. Published in IJCAI 2023 (International Joint Conference on Artificial Intelligence), 2023
Structural causal models provide a formalism to express causal relations between variables of interest. Models and variables can represent a system at different levels of abstraction, whereby variables and relations may be coarsened and refined according to the needs of an agent or a modeller. However, switching between different levels of abstraction requires evaluating the trade-off between the consistency and the information loss among models at different levels of abstraction. In this paper we introduce a family of interventional measures that an agent or a modeller may use to evaluate such a trade-off. We analyze the properties of these measures, and propose algorithms to evaluate and learn causal abstractions. Finally, we illustrate the flexibility of our setup by empirically showing how different measures and algorithmic choices may lead to different abstractions.
Download here
Yorgos Felekis, Fabio Massimo Zennaro, Nicola Branchini, Theodoros Damoulas. Published in CLeaR 2024 (Causal Learning and Reasoning), 2024
Causal abstraction (CA) theory establishes formal criteria for relating multiple structural causal models (SCMs) at different levels of granularity by defining maps between them. These maps have significant relevance for real-world challenges such as synthesizing causal evidence from multiple experimental environments, learning causally consistent representations at different resolutions, and linking interventions across multiple SCMs. In this work, we propose COTA, the first method to learn abstraction maps from observational and interventional data without assuming complete knowledge of the underlying SCMs. In particular, we introduce a multi-marginal Optimal Transport (OT) formulation that enforces do-calculus causal constraints, together with a cost function that relies on interventional information. We extensively evaluate COTA on synthetic and real-world problems, and showcase its advantages over non-causal, independent and aggregated COTA formulations. Finally, we demonstrate the efficiency of our method as a data augmentation tool by comparing it against the state-of-the-art CA learning framework, which assumes fully specified SCMs, on a real-world downstream task.
Download here
Fabio Massimo Zennaro and Kai Olav Ellefsen. Published in NORA 2024 (Norwegian Artificial Intelligence Research Consortium Conference), 2024
Recent work in machine learning and artificial intelligence has dealt with bringing together two fundamental concepts underlying intelligence: causality and abstraction. Given two SCMs representing the same system at different levels of detail, learning an abstraction between them turned out to be a hard combinatorial problem, and solutions have been proposed relying on neural networks and optimal transport. In this paper, we consider the original problem of abstraction learning and we generalize it. We start by highlighting a connection to the standard bin packing problem, and then we show that the problem at hand is actually an instance of a larger class of non-decomposable product of combinatorial problems. We propose an approach to solving these problems relying on genetic algorithms. Compared with a gradient descent approach, genetic algorithms do not require a relaxation of the problem and they can explore different regions of the solution space without getting stuck in sub-optimal minima. Furthermore, we will also show that the compositional form of the problem suits well the schemata hypothesis for genetic algorithms.
Download here
Fabio Massimo Zennaro, Nicholas Bishop, Joel Dyer, Yorgos Felekis, Anisoara Calinescu, Michael Wooldridge, Theodoros Damoulas. Published in UAI 2024 (Conference on Uncertainty in Artificial Intelligence), 2024
Multi-armed bandits (MABs) and causal MABs (CMABs) are established frameworks for decision-making problems. The majority of prior work typically studies and solves individual MABs and CMABs in isolation for a given problem and associated data. However, decision-makers are often faced with multiple related problems and multi-scale observations where joint formulations are needed in order to efficiently exploit the problem structures and data dependencies. Transfer learning for CMABs addresses the situation where models are defined on identical variables, although causal connections may differ. In this work, we extend transfer learning to setups involving CMABs defined on potentially different variables, with varying degrees of granularity, and related via an abstraction map. Formally, we introduce the problem of causally abstracted MABs (CAMABs) by relying on the theory of causal abstraction in order to express a rigorous abstraction map. We propose algorithms to learn in a CAMAB, and study their regret. We illustrate the limitations and the strengths of our algorithms on a real-world scenario related to online advertising.
Download here
Joel Dyer, Nicholas Bishop, Yorgos Felekis, Fabio Massimo Zennaro, Anisoara Calinescu, Theodoros Damoulas, Michael Wooldridge. Published in NeurIPS 2024 (Conference on Neural Information Processing Systems), 2024
Agent-based simulators provide granular representations of complex intelligent systems by directly modelling the interactions of the system’s constituent agents. Their high-fidelity nature enables hyper-local policy evaluation and testing of what-if scenarios, but is associated with large computational costs that inhibit their widespread use. Surrogate models can address these computational limitations, but they must behave consistently with the agent-based model under policy interventions of interest. In this paper, we capitalise on recent developments on causal abstractions to develop a framework for learning interventionally consistent surrogate models for agent-based simulators. Our proposed approach facilitates rapid experimentation with policy interventions in complex systems, while inducing surrogates to behave consistently with high probability with respect to the agent-based simulator across interventions of interest. We demonstrate with empirical studies that observationally trained surrogates can misjudge the effect of interventions and misguide policymakers towards suboptimal policies, while surrogates trained for interventional consistency with our proposed method closely mimic the behaviour of an agent-based model under interventions of interest.
Download here
Published:
This talk provides a brief introduction to the life and times of Unamuno, and then discusses his philosophy of religion as presented in his main work, The Tragic Sense of Life.
Published:
This talk provides a simple conceptual introduction to the topic of deep learning. It coarsely traces the development of neural network models, and it tries to clarify ideas, architectures, and the relationship between them.
Published:
Sparse filtering is an algorithm for unsupervised learning proposed in 2011. The authors introduced this algorithm as a paradigm of feature distribution learning, contrasting it with more traditional data distribution learning. In this seminar, we will explore the ideas behind sparse filtering following the original paper published in 2011. We will first discuss the general idea of feature distribution learning; then, we will present the specific algorithm for sparse filtering; finally, we will conclude with a discussion of the algorithm and a summary of further developments since the publication of the original paper.
Published:
This talk provides an overview of the problem of disentangling emotional speech features and offers an analysis of several approaches to tackle this problem, along with preliminary results.
Published:
This talk is meant to be a simple introduction to Principe’s framework for Information Theoretic Learning. We will first review a standard information theoretic measure, going through its derivation, its properties and its limitations. We will then derive a more general form of this information theoretic measure, and we will use it to compute statistical estimators. Finally, we will define an information theoretic loss function that can be used for learning.
Published:
This talk provides an overview of some topics at the intersection of cybersecurity and machine learning with the aim of illustrating the possibilities offered by machine learning and surveying recent promising lines of research at the border between the two disciplines. The first part of the talk gives a brief introduction to machine learning from a conceptual point of view. The second part then explores research topics in three main domains: applications of machine learning to security; security aspects of machine learning; and, finally, safety concerns related to machine learning.
Published:
This talk provides an overview of the problem of aggregating several probabilistic structural causal models and offers a walkthrough of our algorithm applied to a toy case scenario.
Published:
This talk provides an overview of the research in the fields of adversarial machine learning and AI safety. The first part of the talk gives a brief introduction to machine learning from a conceptual point of view; the second and third parts respectively illustrate some representative attacks and defenses for machine learning systems; and, finally, the last part lists safety concerns related to machine learning and artificial intelligence. (This presentation has some overlap with the previous talk “Research Challenges for Applying Machine Learning in Cybersecurity”)
Published:
This talk provides a short presentation of three perspectives to explore the intersection of cybersecurity and machine learning. It examines an instrumental perspective (in which ML is seen as a tool), a systemic perspective (in which ML is seen as a component of a system to defend), and a societal perspective (in which ML is seen as a part of societal processes). Each perspective is reconnected with specific areas of research (cybersecurity, adversarial learning, AI safety).
Published:
In this presentation we are going to introduce causal models from the point of view of computer science, following the approach based on structural causal models proposed by Pearl. We will start by showing the place of causality theory and by discussing its relationship with standard statistics. We will then present graphical models (directed acyclic graphs, Bayesian networks, causal Bayesian networks, and structural causal models) to address causal questions. We will then review some paradigmatic problems that arise in the field of causality, and how they can be solved.
Published:
This talk aims at providing an overall understanding of the role of causal modelling, and its relationship to machine learning. We are going to introduce causal models following the popular approach based on structural causal models proposed by Pearl, and show how they can capture the notion of causal relations. We will consider paradigmatic causal problems (causal inference and causal discovery) and discuss how they can be tackled. Finally, we will briefly explore connections between causality and machine learning, touching on topics such as learning with causal assumptions, using counterfactuals to assess fairness, and expressing reinforcement learning problems in causal terms.
Published:
These slides present the research project of modelling hacking in the form of capture-the-flag (CTF) games as problems solvable by agents trained by reinforcement learning (RL). The main assumptions and challenges are presented, along with some preliminary results.
Published:
This short presentation introduces the method of information bottleneck by describing its formulation and by illustrating its application in analyzing the behaviour of deep neural networks. The presentation ends by discussing the problem of using a similar information-theoretic method to study the behaviour of unsupervised learning algorithms, focusing in particular on the analysis of the sparse filtering algorithm.
Published:
This talk provides an overview of the problem of aggregating several probabilistic structural causal models and offers a walkthrough of our algorithm applied to a toy case scenario.
Published:
This talk briefly summarizes several observations on the political value of adopting machine learning systems in criminal justice presented through the lens of Left Realism.
Published:
These slides provide a quick conceptual introduction to neural networks for supervised learning, and review some hypotheses and theories meant to explain the generalization performance of learning. The presentation then focuses on one of these possible interpretative frameworks, information bottleneck, and discusses its possible application to understand the dynamics of unsupervised learning algorithms, such as sparse filtering.
Published:
These slides provide an overview of the topic of the security of machine learning systems. We identify the two main attack surfaces inherent in machine learning systems, and we then provide a review of the main attacks and defenses, heavily relying on analogical reasoning to illustrate and explain these methods. The presentation ends with remarks on the practical implications of these vulnerabilities and the current directions of research.
Published:
Structural causal models (SCMs) constitute a rigorous and tested formalism to deal with causality in many fields, including artificial intelligence and machine learning. Systems and phenomena may be modelled as SCMs and then studied using the tools provided by the framework of causality. A given system can, however, be modelled at different levels of abstraction, depending on the aims or the resources of a modeller. The most exemplary case is probably statistical physics, where a thermodynamical system may be represented either as a collection of microscopic particles or as a single body with macroscopic properties. In general, however, switching between models with different granularities presents non-trivial challenges and raises questions of consistency. These slides will first provide a brief introduction to SCMs, and then consider how we can express the problem of relating SCMs representing the same phenomenon at different levels of abstraction. Finally, we will discuss open challenges and present some existing solutions, as well as pointing towards possible future directions of research.
Published:
These slides provide a synthetic overview of the problem of relating structural causal models (SCMs) at different levels of abstraction. We define the problem and discuss the desiderata of our solution. We present a few of the existing formalizations and solutions offered in the literature. We then conclude by highlighting interesting future directions of research in this area.
Published:
These slides analyze the application of reinforcement learning for modelling the problem of penetration testing in computer security. After a conceptual overview of reinforcement learning, we discuss the specific challenges in modeling penetration testing as a game that may be solved by a reinforcement learning agent. Finally, we present some of the work done by the research group at the University of Oslo on this topic, including conceptual modelling and preliminary practical implementations of reinforcement learning environments and agents.
Published:
In this presentation we consider the problem of relating causal models representing the same phenomenon or system at different levels of abstraction. A given system may be represented with more or less detail according to the resources or the needs of a modeler; switching between descriptions at different levels of abstraction is not trivial, and it raises questions of consistency. In this presentation, we will focus in particular on structural causal models (SCMs) and we will express properties of consistency in this context. We will then present two formalisms for defining a relation of abstraction between SCMs: an approach based on the definition of a transformation between the outcomes of models, and an approach based on the definition of a mapping between the structure of models. We will then conclude with some observations and some questions regarding this current direction of research.
Published:
In this presentation we first offer a review of definitions of abstractions proposed in the literature, and then we propose a framework to align these definitions and evaluate their properties. We suggest analyzing abstractions on two layers (a structural layer and a distributional layer) and we review some basic properties that may be enforced on maps defined on each layer. We suggest that this framework may contribute to a better understanding of different forms of abstraction, as well as providing a way to tailor application-specific definitions of abstraction.
Published:
Causal models can represent an identical system or phenomenon at different levels of abstraction. In this talk, we will focus on structural causal models (SCM) and review two frameworks which have been proposed to express a relation of abstraction between SCMs and to measure the interventional consistency of an abstraction. We will then discuss some current directions of research, including the problem of learning abstractions.
Published:
In this presentation we review the definition of abstraction between structural causal models and we frame the problem of learning a mapping between them. We discuss the challenges of learning a causal abstraction that minimizes the abstraction error in terms of interventional consistency. We then suggest an approach based on a relaxation and parametrization of the problem, leading to a solution based on differentiable programming. The solution approach is evaluated both on synthetic and real-world data.
Published:
In this talk we will introduce one of the most important formalisms to represent causal systems in computer science. We will start with a brief review of causality, highlighting the meaning of causal queries and the limitations of standard statistics and machine learning in answering them. To address these shortcomings, we will present the formalism of structural causal models (SCMs). We will then show how these models can be used to rigorously answer different types of causal questions, including observational, interventional and counterfactual questions. Finally, we will conclude by discussing how this formalization gives rise to a rich theory of causality, and how the ideas underlying causality have strong and promising intersections with artificial intelligence and machine learning.
Published:
Modelling complex systems and processes, such as battery manufacturing, is a significant scientific and technical challenge. Mathematics, statistics and machine learning provide useful tools to tackle this problem. In this talk, we will focus on the recent formalism of structural causal models (SCMs) and causal abstractions (CAs). We will first offer a high-level introduction to SCMs and CAs, discussing in particular their importance and relevance for modelling. We will then make a reference to our original methodology for learning CAs. Finally, we will showcase our preliminary results on the problem of modelling one stage of the lithium-ion battery manufacturing process, demonstrating the potential for integrating data collected by different research groups.
Published:
In this presentation we quickly review the idea of defining an abstraction between structural causal models and we present the standard measure of abstraction error proposed in the literature. We then consider some potential limitations when using this single measure to assess or learn abstractions. To overcome these limitations, we propose an extension of the original definition of abstraction approximation, we derive new measures of abstraction error, and we discuss theoretical and applied properties of these new measures.
Published:
In this talk we discuss how rigorous relations between causal models may be defined and quantitatively evaluated. We will start with a quick introduction to the popular formalism of structural causal models. Next, we will review alternative proposals for expressing relations of abstraction between these models. We will then focus on one particular framework, and show how a notion of abstraction error can be introduced in this setup. Finally, we will discuss some of the limitations of this measure, and how alternative measures of error may be developed in order to capture different aspects of abstraction and fit different aims. We will conclude with a few considerations about possible future developments of this theory of abstraction.
Published:
In this presentation we review the definition of structural causal models and we introduce the problem of relating these models via an abstraction map. We formalize the problem of learning such a causal abstraction map as a minimizer of an abstraction error expressed in terms of interventional consistency, and we discuss some of the challenges involved in this optimization problem. We then present an approach based on a relaxation and parametrization of the problem, leading to a solution based on differentiable programming. The solution approach is evaluated both on synthetic and real-world data.
Published:
In this presentation we review the definition of structural causal models and we introduce the problem of relating these models via an abstraction map. We formalize the problem of learning such a causal abstraction map as a minimizer of an abstraction error expressed in terms of interventional consistency, and we discuss some of the challenges involved in this optimization problem. We then present an approach based on a relaxation and parametrization of the problem, leading to a solution based on differentiable programming. The solution approach is evaluated both on synthetic and real-world data.
Published:
In this talk we give a quick overview of the Machine Learning group at UiB and of its research interests and directions.
Published:
This talk will touch on two aspects of human reasoning that are crucial for AI: causal reasoning and multi-level reasoning. We will discuss the role and the relevance of causal models in machine learning systems, as well as the ubiquity of multi-level reasoning across applications. We will then review some recent approaches based on causal abstraction, aimed at integrating causal reasoning and multi-level reasoning. We will conclude by suggesting possible developments and deployment of these methods.
Published:
This talk reviews the basic notions of causality and causal abstraction, defines the problem of causal abstraction learning, and suggests a solution based on genetic algorithms.
Published:
This talk provides a conceptual introduction to machine learning for a multidisciplinary audience.
Published:
Multi-armed bandits are a standard formalism to represent simple yet realistic decision-making problems in which a policy-maker has to find an optimal balance between choosing well-known options and exploring new alternatives. Traditionally, such decision-making problems are encoded using a single model; however, in reality, a decision-maker may have multiple related models of the same problem at different levels of resolution, each one providing information about the value and the effects of the available choices. In this talk we will recall the standard multi-armed bandits framework, extend it to a causal setting, and explain how multiple models can be related via causal abstractions. Finally, we will discuss a few theoretical results about transporting information across the models via abstraction using basic algorithms inspired by reinforcement learning.
Published:
This talk introduces the formalism of structural causal models, explains the idea of abstractions between causal models, provides the definition of alpha-abstractions, and describes a solution for abstraction learning.
Undergraduate course, University of Manchester, 2014
Teaching assistance in Introduction to Machine Learning
Undergraduate course, University of Manchester, 2015
Teaching assistance in Digital Biology
Undergraduate course, University of Manchester, 2015
Teaching assistance in Modelling and Visualization of High Dimensional Data
Undergraduate course, University of Manchester, 2016
Teaching assistance in Modelling and Visualization of High Dimensional Data
Undergraduate and graduate course, University of Oslo, 2020
Guest lecturer for the class on Unsupervised Learning.
Undergraduate and graduate course, University of Oslo, 2020
Guest lecturer for the class on Unsupervised Learning and contributor to the class on Ethics and Future Perspectives of Machine Learning.
Undergraduate and graduate course, OsloMet University, 2020
Guest lecturer for a class providing a brief introduction on the key concepts of machine learning and a short tutorial on using scikit-learn to instantiate models and train them.
Graduate course, University of Bergen, 2024
Lecturer for the course INF368A Selected Topics in Machine Learning. The course covers the foundations of Reinforcement Learning, from modelling with Markov Decision Processes to applications of Deep Reinforcement Learning.
Undergraduate course, University of Bergen, 2024
Guest lecturer for the class on Reinforcement Learning.
Graduate course, University of Bergen, 2025
Lecturer for the course INF266 Reinforcement Learning. The course covers the foundations of Reinforcement Learning, from modelling with Markov Decision Processes to applications of Deep Reinforcement Learning.