Welcome!

The Workshop on Causality and Large Models (C♥️LM) will be co-located with NeurIPS 2024 in Vancouver and held in December 2024.

About the Workshop

The remarkable capabilities and accessibility of recent large models, also known as “foundation models,” have sparked significant interest and excitement in the research community and beyond. In particular, large pre-trained generative models have demonstrated impressive competencies in understanding and generating human-like text, despite being trained on largely unstructured data with relatively simple self-supervised learning objectives. This raises the question: (A) Why do such large models work so well?

Their impressive performance across a wide variety of benchmarks, sometimes even exceeding human experts, together with the incorporation of multiple modalities such as images, text, and audio, makes these large models particularly versatile decision-making systems. However, their increased adoption is not without challenges. The growing size and complexity of these “black box” models raise concerns about their trustworthiness and reliability. For real-world applications, where distribution shifts are pervasive and sufficient high-quality data may be difficult or expensive to collect, it is crucial to systematically verify and enhance the robustness and generalization capabilities of these models. This is especially pertinent in safety-critical domains such as healthcare and policy-making. Consequently, we must ask: (B) Under what circumstances can we trust these large models, and how can this trust be improved?

Enter causality: a systematic framework to formalize “why?” and “how?” questions like (A) and (B) and to develop principled tools to address them. Causal inference offers a powerful approach to describe a system’s behavior under interventions and to reason over counterfactual scenarios. By relying on stable causal relationships rather than potentially spurious statistical correlations, causal models can transparently elucidate a system’s behavior and enable performance guarantees beyond the training distribution, which is crucial for high-risk applications. However, despite growing attention from the community, translating the rigorous theoretical tools of causality into practical methods remains a notable challenge, especially in the large-scale regime of large models with their heterogeneous, unstructured data.

With the striking potential of causality on the one hand, and the enormous interest in tackling the many open questions about understanding and improving large models on the other, we propose a workshop that aims to explore the many exciting synergies between causality and large models. Specifically, we identify four main directions to cover in our workshop:

We would like to acknowledge a related workshop, Are Large Language Models Simply Causal Parrots?, held at AAAI 2024, which shared some of our objectives.

Important Dates

Contact us at calmworkshop2024@gmail.com