Gradient-Based Multi-Objective Optimization and its Application in Large Language Models

IJCAI 2025 Tutorial Session T11 @ Guangzhou, China

Overview

Deep learning models must often balance multiple, potentially conflicting criteria during training and evaluation. This tutorial provides a structured overview of gradient-based multi-objective optimization (MOO) for deep learning. We begin with the foundational theory, systematically exploring three core solution strategies: identifying a single balanced solution, finding a finite set of Pareto optimal solutions, and learning a continuous Pareto set. We cover their algorithmic details, convergence guarantees, and generalization behavior. The second half of the tutorial focuses on applying MOO to Large Language Models (LLMs): we demonstrate how MOO offers a principled framework for fine-tuning and aligning LLMs while navigating trade-offs among multiple objectives, with practical demonstrations of state-of-the-art methods showing when and how to apply each technique. The session concludes by discussing emerging challenges and future research directions, equipping attendees to tackle multi-objective problems in their own work. This tutorial is based on our survey paper "Gradient-Based Multi-Objective Deep Learning: Algorithms, Theories, Applications, and Beyond".
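
For reference, the problem underlying the whole session can be stated compactly. The formulation below uses our own notation and is the standard setup rather than any one method covered in the tutorial:

    \min_{\theta \in \mathbb{R}^d} \; F(\theta) = \big( f_1(\theta), \ldots, f_m(\theta) \big)

where f_1, ..., f_m are the objective (loss) functions. A point \theta dominates \theta' if f_i(\theta) \le f_i(\theta') for all i, with strict inequality for at least one i; \theta is Pareto optimal if no other point dominates it, and the image of all Pareto optimal points under F is the Pareto front.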

Speakers

Weiyu Chen

PhD Student at HKUST

Multi-objective optimization

Baijiong Lin

PhD Student at HKUST(GZ)

Multi-task learning and LLM post-training

Xiaoyuan Zhang

PhD Student at CityU

Multi-objective optimization

Xi Lin

Postdoc at CityU

Multi-objective optimization

Han Zhao

Assistant Professor at UIUC

Multi-objective optimization

Schedule

Session 1: Introduction and Advanced Methods

Introduction to MOO in Deep Learning

  • Motivation and problem formulation
  • Key concepts: Pareto optimality, dominance, Pareto front
  • Traditional approaches and limitations

Finding a Single Pareto Optimal Solution

  • Loss balancing methods
  • Gradient weighting methods (a minimal MGDA sketch follows this list)
  • Gradient manipulation methods
  • Practical speedup strategies
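
For concreteness, here is a minimal NumPy sketch of the classic gradient-weighting method MGDA (Sener & Koltun, 2018) in the two-objective case, where the min-norm convex combination of the task gradients has a closed form. The function name and toy inputs are ours; the tutorial covers the general m-objective solver and the speedup strategies above.

    import numpy as np

    def mgda_two_objective_direction(g1: np.ndarray, g2: np.ndarray) -> np.ndarray:
        """Closed-form min-norm convex combination of two task gradients.

        Solves min_{a in [0, 1]} ||a*g1 + (1-a)*g2||^2 and returns the
        common descent direction d = a*g1 + (1-a)*g2.
        """
        diff = g1 - g2
        denom = float(diff @ diff)
        if denom == 0.0:                    # identical gradients: no conflict
            return g1
        a = float((g2 - g1) @ g2) / denom   # unconstrained minimizer
        a = min(max(a, 0.0), 1.0)           # project onto [0, 1]
        return a * g1 + (1.0 - a) * g2

    # Toy usage: two maximally conflicting gradients in R^2.
    g1, g2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
    print(mgda_two_objective_direction(g1, g2))  # [0.5 0.5]: descends on both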

Finding a Finite Set of Solutions

  • Methods based on preference vectors (a scalarization sketch follows this list)
  • Methods without preference vectors
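
As one representative preference-vector technique, the NumPy sketch below uses weighted Chebyshev (Tchebycheff) scalarization: each preference vector on the simplex defines one scalar problem whose minimizer is one Pareto optimal solution. The names and toy loss values are ours; the session covers more refined variants.

    import numpy as np

    def chebyshev_scalarization(losses, pref, ideal):
        """Weighted Chebyshev scalarization of a loss vector.

        Minimizing max_i pref[i] * (losses[i] - ideal[i]) over model
        parameters yields one Pareto solution per preference vector,
        including points on non-convex parts of the front.
        """
        return float(np.max(pref * (np.asarray(losses) - np.asarray(ideal))))

    # A small grid of preference vectors on the 2-simplex; each defines a
    # separate scalar training problem (the optimizer loop is omitted).
    ideal = np.zeros(2)  # per-objective lower bounds, often estimated online
    for w in np.linspace(0.1, 0.9, 5):
        pref = np.array([w, 1.0 - w])
        print(pref, chebyshev_scalarization([0.3, 0.7], pref, ideal))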

Finding an Infinite Set of Solutions

  • Network structures (a hypernetwork sketch follows this list)
  • Training strategies
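
One network structure in this part is the preference-conditioned hypernetwork: a single auxiliary network emits the task model's parameters as a function of the preference vector, so sweeping the preference at inference time traces a continuous approximation of the Pareto set. The PyTorch sketch below is illustrative only; the layer sizes, names, and the omitted training loop are our assumptions.

    import torch
    import torch.nn as nn

    class PreferenceHypernet(nn.Module):
        """Hypernetwork mapping a preference vector to target-model weights.

        The target model here is a single linear layer; one fixed set of
        hypernetwork parameters serves every trade-off.
        """
        def __init__(self, n_objectives=2, target_in=4, target_out=1, hidden=32):
            super().__init__()
            self.target_in, self.target_out = target_in, target_out
            n_params = target_in * target_out + target_out  # weights + biases
            self.net = nn.Sequential(
                nn.Linear(n_objectives, hidden), nn.ReLU(),
                nn.Linear(hidden, n_params),
            )

        def forward(self, pref, x):
            p = self.net(pref)  # target-layer parameters, generated on the fly
            w = p[: self.target_in * self.target_out]
            w = w.view(self.target_out, self.target_in)
            b = p[self.target_in * self.target_out:]
            return x @ w.T + b

    # Training (omitted): sample pref ~ Dirichlet(1), compute all task losses
    # on the generated model, minimize a pref-weighted scalarization of them.
    hyper = PreferenceHypernet()
    pref = torch.distributions.Dirichlet(torch.ones(2)).sample()
    y = hyper(pref, torch.randn(8, 4))  # predictions under this trade-off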

Session 2: Theories, Applications, and Beyond

Theoretical Foundations

  • Convergence analysis (deterministic vs. stochastic; see the stationarity condition after this list)
  • Generalization bounds
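
For orientation, convergence in this setting is usually measured against Pareto stationarity rather than a scalar gradient norm. A standard first-order condition, in the notation of the formulation above, is

    \min_{\lambda \in \Delta^{m-1}} \Big\| \sum_{i=1}^{m} \lambda_i \nabla f_i(\theta) \Big\|_2 = 0,

where \Delta^{m-1} is the probability simplex over the m objectives. Deterministic analyses typically show this quantity is driven to zero along the iterates, while stochastic analyses must additionally control the bias that noisy gradients introduce into the estimated weights \lambda.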

Applications in LLMs

  • Multi-objective fine-tuning
  • Multi-objective alignment
  • Multi-objective test-time alignment (a decoding-time sketch follows this list)
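
As a toy illustration of the test-time alignment idea, the sketch below blends next-token logits from models (or heads) specialized for different objectives using a user-chosen preference vector, so the trade-off can be adjusted at decoding time without retraining. The names, shapes, and two-objective setup are our assumptions, not any specific published method.

    import torch

    def blended_next_token_logits(logits_per_objective, pref):
        """Preference-weighted mixture of per-objective next-token logits."""
        stacked = torch.stack(logits_per_objective)   # (m, vocab_size)
        return (pref.view(-1, 1) * stacked).sum(0)    # (vocab_size,)

    # Toy usage: a 5-token vocabulary and two objectives
    # (e.g., helpfulness vs. harmlessness); pref is chosen at inference.
    logits_help, logits_harm = torch.randn(5), torch.randn(5)
    pref = torch.tensor([0.7, 0.3])
    next_token = torch.argmax(
        blended_next_token_logits([logits_help, logits_harm], pref)
    )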

Open Source Libraries

  • LibMTL
  • LibMOON

Open Challenges and Future Directions

  • Theoretical understanding
  • Reducing computational costs
  • Handling a large number of objectives
  • Distributed training

Concluding Discussion

Venue

📅 Date & Time: August 29th, 2025

⏱️ Duration: Two 1-hour-45-minute slots (full afternoon)

📍 Location: Langham Place, Guangzhou, China

🚪 Room: TBA

Materials

Contact

For any questions regarding the tutorial, please contact Weiyu Chen.