Gradient-Based Multi-Objective Optimization and its Application in Large Language Models

IJCAI 2025 @ Montreal, Canada and Guangzhou, China

Overview

Training and evaluating deep learning models often involves navigating trade-offs among multiple, frequently conflicting criteria. This tutorial provides a structured overview of gradient-based multi-objective optimization (MOO) for deep learning models. We begin with the foundational theory, systematically exploring three core solution strategies: identifying a single balanced solution, finding a discrete set of Pareto optimal solutions, and learning a continuous Pareto set. We will cover their algorithmic details, convergence guarantees, and generalization behavior. The second half of the tutorial focuses on applying MOO to Large Language Models (LLMs). We will demonstrate how MOO offers a principled framework for fine-tuning and aligning LLMs, effectively navigating trade-offs between multiple objectives. Through practical demonstrations of state-of-the-art methods, participants will gain concrete insight into how these techniques are implemented in practice. The session will conclude by discussing emerging challenges and future research directions, equipping attendees to tackle multi-objective problems in their own work. This tutorial is based on our survey paper "Gradient-Based Multi-Objective Deep Learning: Algorithms, Theories, Applications, and Beyond".
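
Concretely, the tutorial studies the vector-valued problem below; the notation is a standard formulation consistent with the survey literature, not a verbatim excerpt:

    \min_{\theta \in \mathbb{R}^d} \; \mathbf{L}(\theta) = \big( L_1(\theta), \ldots, L_m(\theta) \big)^{\top}

Because the m objectives typically conflict, no single \theta minimizes all of them at once; solutions are instead compared through Pareto dominance, and the three strategies above correspond to returning one, finitely many, or a continuum of Pareto optimal solutions.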

Speakers

Weiyu Chen

PhD Student at HKUST

Multi-objective optimization

Baijiong Lin

PhD Student at HKUST(GZ)

Multi-task learning and LLM post-training

Xiaoyuan Zhang

PhD Student at CityU

Multi-objective optimization

Xi Lin

Postdoc at CityU

Multi-objective optimization

Han Zhao

Assistant Professor at UIUC

Multi-objective optimization

Schedule

Session 1: Foundations and Single-Solution Methods

Introduction to MOO in Deep Learning

  • Motivation and problem formulation
  • Key concepts: Pareto optimality, dominance, Pareto front (defined after this list)
  • Traditional approaches and limitations
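
For reference during this segment, the key concepts can be stated compactly; this is a sketch in our own notation, matching the formulation in the Overview:

    % \theta_a dominates \theta_b iff it is no worse on every objective
    % and strictly better on at least one:
    \theta_a \prec \theta_b \iff \forall i:\, L_i(\theta_a) \le L_i(\theta_b)
        \;\wedge\; \exists j:\, L_j(\theta_a) < L_j(\theta_b)
    % \theta^* is Pareto optimal iff no \theta dominates it; the Pareto front
    % is the set of objective vectors \mathbf{L}(\theta^*) over all such points.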

Finding a Single Pareto Optimal Solution

  • Loss balancing methods
  • Gradient weighting methods (see the sketch after this list)
  • Gradient manipulation methods
  • Practical speedup strategies
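
To give a flavor of the gradient weighting family, below is a minimal PyTorch sketch of an MGDA-style step for two objectives, using the closed-form min-norm weight from Sener & Koltun (2018). The function name, the plain SGD update, and the two-objective restriction are illustrative assumptions, not the tutorial's reference implementation:

    import torch

    def mgda_two_objective_step(model, loss1, loss2, lr=1e-2):
        """One descent step along the min-norm convex combination of two gradients."""
        params = [p for p in model.parameters() if p.requires_grad]

        # Per-objective gradients; retain the graph so the second call is valid.
        g1 = torch.autograd.grad(loss1, params, retain_graph=True)
        g2 = torch.autograd.grad(loss2, params)
        v1 = torch.cat([g.reshape(-1) for g in g1])
        v2 = torch.cat([g.reshape(-1) for g in g2])

        # Closed-form minimizer of || a*v1 + (1 - a)*v2 ||^2 over a in [0, 1].
        diff = v1 - v2
        alpha = torch.clamp(
            torch.dot(v2 - v1, v2) / torch.dot(diff, diff).clamp_min(1e-12), 0.0, 1.0
        )

        # The combined direction decreases both losses unless the current point
        # is already Pareto stationary (min-norm combination close to zero).
        with torch.no_grad():
            for p, a, b in zip(params, g1, g2):
                p -= lr * (alpha * a + (1.0 - alpha) * b)

Both losses must come from the same forward pass, so the first grad call retains the graph that the second one reuses.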

Finding a Finite Set of Solutions

  • Methods based on preference vectors (sketched after this list)
  • Methods without preference vectors
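
A common preference-vector recipe is to solve one scalarized problem per preference vector. The following self-contained toy sketch uses the weighted Tchebycheff scalarization on two deliberately conflicting quadratics; the toy losses, preference grid, and optimizer settings are all assumptions chosen for illustration:

    import torch

    def tchebycheff(losses, pref, ideal=0.0):
        """Weighted Tchebycheff scalarization: max_i pref_i * (L_i - ideal)."""
        return torch.max(pref * (losses - ideal))

    prefs = [torch.tensor([w, 1.0 - w]) for w in (0.1, 0.3, 0.5, 0.7, 0.9)]
    solutions = []
    for pref in prefs:
        theta = torch.zeros(1, requires_grad=True)
        opt = torch.optim.SGD([theta], lr=0.1)
        for _ in range(200):
            # Two conflicting toy objectives with minima at +1 and -1.
            losses = torch.stack([(theta - 1).pow(2).sum(), (theta + 1).pow(2).sum()])
            opt.zero_grad()
            tchebycheff(losses, pref).backward()
            opt.step()
        solutions.append(theta.detach().item())  # one trade-off point per preference

Sweeping the preference vector traces out a finite approximation of the Pareto front, which is exactly the regime this segment covers.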

Finding an Infinite Set of Solutions

  • Network structures (see the hypernetwork sketch after this list)
  • Training strategies
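
To make the "infinite set" idea concrete, here is a hedged sketch of a preference-conditioned hypernetwork in the spirit of Pareto hypernetworks (Navon et al., 2021): one network maps a sampled preference to the weights of a small target model and is trained on the scalarized loss. The layer sizes and the random toy data are assumptions for illustration:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class PreferenceHypernet(nn.Module):
        """Maps a preference vector to the full weights of a small linear model."""
        def __init__(self, n_obj=2, in_dim=10, out_dim=1):
            super().__init__()
            self.in_dim, self.out_dim = in_dim, out_dim
            self.net = nn.Sequential(
                nn.Linear(n_obj, 64), nn.ReLU(),
                nn.Linear(64, in_dim * out_dim + out_dim),
            )

        def forward(self, pref, x):
            flat = self.net(pref)
            weight = flat[: self.in_dim * self.out_dim].view(self.out_dim, self.in_dim)
            bias = flat[self.in_dim * self.out_dim:]
            return F.linear(x, weight, bias)

    hnet = PreferenceHypernet()
    opt = torch.optim.Adam(hnet.parameters(), lr=1e-3)
    for _ in range(1000):
        # Sample a fresh preference each step so one model covers the whole front.
        pref = torch.distributions.Dirichlet(torch.ones(2)).sample()
        x = torch.randn(32, 10)                          # toy inputs (assumption)
        y1, y2 = torch.randn(32, 1), torch.randn(32, 1)  # conflicting toy targets
        pred = hnet(pref, x)
        losses = torch.stack([F.mse_loss(pred, y1), F.mse_loss(pred, y2)])
        opt.zero_grad()
        torch.dot(pref, losses).backward()               # linear scalarization
        opt.step()

After training, feeding any preference vector to the hypernetwork yields the weights of a model realizing that trade-off, so the learned Pareto set is continuous in the preference.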

Session 2: Advanced Methods and Applications

Theoretical Foundations

  • Convergence analysis (deterministic vs. stochastic; see the stationarity condition below)
  • Generalization bounds
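
A common yardstick in these analyses is Pareto stationarity, a necessary condition for Pareto optimality in the smooth case; in our (assumed) notation, with \Delta_m the probability simplex:

    % \theta is Pareto stationary iff some convex combination of the
    % per-objective gradients vanishes:
    \min_{\lambda \in \Delta_m} \Big\| \sum_{i=1}^{m} \lambda_i \nabla L_i(\theta) \Big\|_2 = 0

Deterministic analyses typically bound how fast this min-norm quantity decays with the iteration count, while stochastic analyses must additionally control the bias of weights estimated from noisy mini-batch gradients.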

Applications in LLMs

  • Multi-objective fine-tuning
  • Multi-objective alignment
  • Multi-objective test-time alignment (see the sketch after this list)
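
As one concrete flavor of test-time alignment, the sketch below interpolates the weights of per-objective fine-tuned experts according to a user preference, in the spirit of rewarded-soups-style methods. The function name and the assumption of one expert checkpoint per objective are ours, not a specific method presented in the tutorial:

    import torch

    def interpolate_experts(state_dicts, pref):
        """Merge per-objective expert checkpoints into one preference-aligned model.

        state_dicts: one state_dict per objective (identical architectures).
        pref: non-negative weights summing to 1, one per objective.
        """
        merged = {}
        for key in state_dicts[0]:
            merged[key] = sum(w * sd[key] for w, sd in zip(pref, state_dicts))
        return merged

    # Usage sketch (hypothetical checkpoints): lean 70% toward a "helpfulness"
    # expert and 30% toward a "harmlessness" expert, with no retraining:
    # model.load_state_dict(interpolate_experts([sd_helpful, sd_harmless], [0.7, 0.3]))

Because the merge happens at inference time, a single set of fine-tuned experts can serve a continuum of user preferences without any further training.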

Open Source Libraries

  • LibMTL
  • LibMOON

Open Challenges and Future Directions

  • Theoretical understanding
  • Reducing computational costs
  • Handling a large number of objectives
  • Distributed training

Concluding Discussion

Venue

๐Ÿ›๏ธ Main Venue - Montreal, Canada

๐Ÿ“… Date & Time: TBA

โฑ๏ธ Duration: Two 1:45h slots

๐Ÿ“ Location: Montreal, Canada

๐Ÿšช Room: TBA

๐Ÿ›ฐ๏ธ Satellite Venue - Guangzhou, China

๐Ÿ“… Date & Time: TBA

โฑ๏ธ Duration: Two 1:45h slots

๐Ÿ“ Location: Guangzhou, China

๐Ÿšช Room: TBA

Materials

Contact

For any questions regarding the tutorial, please contact Weiyu Chen.