Leading the evolution of AI

Every day, data scientists
find ways to solve more of
the world’s most challenging
problems with AI.

The Faculty Research Lab exists to push the boundaries of what AI can do, so we can continue to provide our customers with the safest, highest-performing AI in the world. We’re not content with being the best AI provider in the market. We’re here to be the best in the world at expanding and exploring AI’s potential.

£1m+ invested into
AI safety research

Top PhD awards from
Harvard and Cambridge

Research papers published
at all major AI conferences

In partnership with:   

How do we shape the future of AI?

Collaborations with
top universities

keep us in touch with
advances from academia

Research papers

in the best academic venues
help us share our advances
with the wider community

Developing new technology
for customers

means our research is
road-tested in the real world

Machine Learning research

We push forward the frontier
of what AI can do in areas
that are foundational to the
performance of AI systems
applied in industry today.

Our focus has been on researching novel techniques that improve probabilistic models for many types of AI system, and the way in which data is represented by those models.

We’ve used these developments to significantly improve the performance and safety of our customers’ AI systems across all data types: voice fingerprinting, document processing, image representation, and tabular data processing.


Invariant-equivariant representation learning for multi-class data

A representation-learning technique that separates the discrete category information in data from the continuous set of styles.

JUNIPR: a Framework for Unsupervised Machine Learning in Particle Physics

A natural and interpretable approach to unsupervised learning with widespread applications in particle physics. Published in the European Physical Journal.

Binary JUNIPR: an interpretable probabilistic model for discrimination

A demonstration that JUNIPR provides both interpretability and state-of-the-art performance in the critical classification tasks of the field. Published in Physical Review Letters.

Learning disentangled representations with the Wasserstein Autoencoder

A simpler method to learn disentangled representations. Published in ECML.

Learning Deep-Latent Hierarchies by Stacking Wasserstein Autoencoders

An approach to learning complex (deep, hierarchical) data representations without the need for specialised model-architecture tuning.

Gaussian mixture models with Wasserstein distance

A concrete demonstration of how Optimal Transport can help when modelling with discrete latent variables. Published in ACML.
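One reason Optimal Transport pairs well with Gaussian components is that the 2-Wasserstein distance between Gaussians has a closed form, and it stays finite and informative even when the two distributions barely overlap (where KL divergence blows up). Here is a minimal sketch of the one-dimensional case; it illustrates the general idea, not the paper’s specific method:

```python
from math import sqrt

def w2_gaussian_1d(mu1, sigma1, mu2, sigma2):
    """Closed-form 2-Wasserstein distance between two 1-D Gaussians.

    W2(N(mu1, sigma1^2), N(mu2, sigma2^2))
        = sqrt((mu1 - mu2)^2 + (sigma1 - sigma2)^2)
    """
    return sqrt((mu1 - mu2) ** 2 + (sigma1 - sigma2) ** 2)

# Two well-separated unit-variance Gaussians: the distance is simply
# the gap between their means, and it grows smoothly with separation.
print(w2_gaussian_1d(0.0, 1.0, 10.0, 1.0))  # → 10.0
```

Because this quantity is differentiable in the means and standard deviations, mixture components can be fitted by gradient descent on it directly.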

Improving latent variable descriptiveness with AutoGen

Language is so complex that latent variable models in natural language processing often don’t represent information meaningfully. We show that a commonly-applied tweak can be interpreted as a novel, well-justified probabilistic model. Published in ECML.

AI Safety research

We have made advances in
AI safety where previous
state-of-the-art techniques
were insufficient.

With a clear focus on application in high-stakes, real-world problems, these developments are being used every day in critical applications from Healthcare to Financial Services, helping humans engage with AI systems to make better decisions.

We study the risks of AI systems across the four main pillars of AI Safety:


Explainability

understanding how a model
makes its predictions


Privacy

ensuring that AI systems
keep private data private in
development and production


Fairness

measuring and mitigating
bias against protected
groups in AI systems


Robustness

assuring that an AI system
will work as expected
in the real world


Asymmetric Shapley values: incorporating causal knowledge into model-agnostic explainability

A mathematically justified method for incorporating causality into Shapley values, overcoming a fundamental limitation of current-day explainability. Published in NeurIPS.

Shapley explainability on the data manifold

A demonstration of how to calculate Shapley values, the most commonly used method for explaining black-box models, without the flawed assumption that features in the data are independent. This yields the first general, scalable approach to correctly calculating Shapley values. Published in ICLR.
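To make the Shapley idea concrete, here is a minimal sketch that computes exact Shapley values for a toy model by enumerating every feature coalition. The model, inputs, and baseline are purely illustrative, and out-of-coalition features are filled in with simple baseline imputation, exactly the independence-style shortcut whose flaws the paper above addresses by evaluating on the data manifold:

```python
from itertools import combinations
from math import factorial

# Hypothetical toy model: a linear function of three features.
def model(x):
    return 2.0 * x[0] + 1.0 * x[1] - 3.0 * x[2]

def shapley_values(model, x, baseline):
    """Exact Shapley values via the classic coalition-averaging formula.

    Features outside a coalition are replaced by baseline values (simple
    imputation). For n features this costs O(2^n) model evaluations, so
    it is only feasible for tiny toy examples like this one.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phis = []
    for i in range(n):
        phi = 0.0
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for subset in combinations(others, size):
                s = set(subset)
                # Shapley weight: |S|! (n - |S| - 1)! / n!
                weight = factorial(len(s)) * factorial(n - len(s) - 1) / factorial(n)
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis

# For a linear model with a zero baseline, feature i's Shapley value
# is simply w_i * x_i, i.e. approximately [2.0, 2.0, -9.0] here.
print(shapley_values(model, [1.0, 2.0, 3.0], [0.0, 0.0, 0.0]))
```

A useful sanity check on any Shapley implementation: the values sum to the gap between the model’s output at `x` and at the baseline.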

Human-interpretable model explainability on high-dimensional data

A framework for explaining black-box models in terms of the high-level features (e.g. people’s characteristics, objects) that humans care about, rather than the low-level features (e.g. pixels) that the models use.

Explainability for fair machine learning

The first demonstration of directly explaining bias in AI models. Submitted.

Learning to Noise: Application-Agnostic Data Sharing with Local Differential Privacy

A method to share high-dimensional data with the powerful guarantees of Local Differential Privacy. Submitted.
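For a flavour of what a Local Differential Privacy guarantee looks like in practice, here is the textbook randomized-response mechanism for a single bit (not the paper’s high-dimensional method): each user randomizes their own answer before sharing it, so no individual report says much about them, yet the aggregator can still debias the noisy reports into an accurate population estimate.

```python
import random
from math import exp

def randomized_response(bit, epsilon, rng=random):
    """One-bit local differential privacy via randomized response.

    The user reports their true bit with probability e^eps / (1 + e^eps)
    and the flipped bit otherwise, which satisfies eps-local DP.
    """
    p_true = exp(epsilon) / (1.0 + exp(epsilon))
    return bit if rng.random() < p_true else 1 - bit

def debias_mean(reports, epsilon):
    """Unbiased estimate of the true proportion of 1s from noisy reports.

    Since E[report] = (2p - 1) * q + (1 - p) for true proportion q,
    invert that affine map to recover q.
    """
    p = exp(epsilon) / (1.0 + exp(epsilon))
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)

# Simulate 100,000 users, 30% of whom hold a sensitive attribute.
rng = random.Random(0)
true_bits = [1 if rng.random() < 0.3 else 0 for _ in range(100_000)]
reports = [randomized_response(b, 1.0, rng) for b in true_bits]
print(debias_mean(reports, 1.0))  # close to 0.3
```

The privacy-utility trade-off is visible in `epsilon`: smaller values flip bits more often, giving each user stronger deniability at the cost of a noisier aggregate estimate.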