Overview
Objectives
This course provides practical experience in managing generative AI systems and develops the skills needed to assess large language models (LLMs). It focuses on the deployment, monitoring, optimization, and scaling of generative AI applications in production using Generative AI Operations (GenAIOps).
A brief introduction to Machine Learning Operations (MLOps) provides the foundational understanding needed before diving into LLM-specific challenges such as performance tuning, cost efficiency, observability, retrieval-augmented generation (RAG), LLM agents, and prompt management.
Participants will gain experience with common commercial and open-source cloud-based LLM tooling and platforms, enabling them to build scalable, efficient, and cost-effective LLM applications.
Audience
Learning outcomes
This course equips participants with key competencies in deploying, monitoring, optimizing, and scaling large language models (LLMs) in operational environments:
- Foundational Knowledge of MLOps and LLMOps: Understanding the lifecycle of LLMs, including pre-training, fine-tuning, inference, and monitoring; distinguishing between MLOps and LLMOps.
- Deployment Strategies for LLMs: Learning various hosting options, tackling scaling issues like model parallelism, and exploring cost optimization strategies such as caching.
- Prompt Engineering and Management: Gaining skills in prompt engineering for enhanced efficiency, accuracy, and cost-effectiveness; managing and versioning prompts to ensure consistency and maintainability.
- Performance and Cost Optimization: Techniques to optimize inference speed and reduce operational costs.
- Observability and Monitoring: Setting up systems to track LLM performance metrics such as latency and token usage, detecting errors, and implementing feedback loop mechanisms.
Programme
Day 1. GenAIOps/LLMOps Fundamentals
1. Introduction to MLOps and LLMOps
- MLOps vs. GenAIOps/LLMOps: Key differences and challenges
- LLM lifecycle: Pre-training, fine-tuning, inference, and monitoring
- Challenges in deploying LLMs: Scalability, latency, cost, observability
2. GenAIOps/LLMOps Lifecycle
- Foundation models
- Fine-tuning
- LLM deployment strategies and infrastructure
- Scaling challenges: Model parallelism, quantization, and distillation
- Cost optimization strategies: Caching, batch inference, serverless deployments, and optimizing inference speed (see the sketch after this list)
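For illustration only (not part of the official course material), a minimal sketch of the caching idea in Python; `call_llm` is a hypothetical placeholder for whichever provider SDK is used in the hands-on sessions:

```python
import functools

def call_llm(prompt: str) -> str:
    # Hypothetical placeholder for a real provider SDK call (assumption).
    return f"<model answer to: {prompt}>"

@functools.lru_cache(maxsize=1024)
def cached_completion(prompt: str) -> str:
    """Serve repeated identical prompts from an in-memory cache,
    so only the first occurrence triggers a billed inference call."""
    return call_llm(prompt)

if __name__ == "__main__":
    # The second call hits the cache instead of the model.
    print(cached_completion("Summarise GenAIOps in one sentence."))
    print(cached_completion("Summarise GenAIOps in one sentence."))
```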
3. Hands-on Session: LLM Fine-tuning
- Fine-tune a small LLM
4. Prompt Engineering
- Prompt Engineering Basics
- Optimizing prompts for efficiency, accuracy, and cost
- Structuring prompts for different use cases
- Fine-tuning vs. prompt engineering: When to use which
5. Prompt Management
- Why prompt management matters: Consistency, scalability, and maintainability
- Versioning and tracking prompts: Best practices
- Using prompt management tools
- A/B testing prompts: Measuring effectiveness and iterating (see the sketch after this list)
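As an informal illustration of prompt versioning and A/B testing (a sketch with made-up names, not the tooling used in class):

```python
import random

# A tiny versioned prompt registry: each named prompt keeps every version
# so changes can be tracked, compared, and rolled back.
PROMPTS = {
    "summarise": {
        "v1": "Summarise the following text:\n{text}",
        "v2": "Summarise the following text in three bullet points:\n{text}",
    }
}

def get_prompt(name: str, version: str) -> str:
    return PROMPTS[name][version]

def ab_pick(name: str, variants=("v1", "v2")) -> tuple[str, str]:
    """Randomly assign one of two prompt versions to a request,
    returning (version, template) so the choice can be logged and evaluated."""
    version = random.choice(variants)
    return version, get_prompt(name, version)

version, template = ab_pick("summarise")
print(version, "->", template.format(text="GenAIOps covers deployment and monitoring of LLMs."))
```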
6. Hands-on Session: Prompt Engineering & Management
- Experimenting with different prompt engineering techniques and tools
Day 2. Advanced LLMOps
1. Retrieval-augmented generation (RAG)
- What is RAG and how does it work? (a minimal sketch follows this list)
- Exploring the different types of RAG
- Common issues in RAG solutions
- When to use fine-tuning vs. prompt engineering vs. RAG
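For orientation, a deliberately simplified RAG loop in plain Python; the keyword-overlap retrieval and the placeholder `call_llm` are assumptions for illustration only (a real system would use embeddings, a vector index, and a provider SDK):

```python
DOCUMENTS = [
    "GenAIOps covers deployment, monitoring and scaling of LLM applications.",
    "RAG retrieves relevant documents and adds them to the prompt as context.",
    "Fine-tuning adapts model weights to a task using additional training data.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Toy retrieval step: rank documents by word overlap with the query."""
    q = set(query.lower().split())
    scored = sorted(DOCUMENTS, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return scored[:k]

def call_llm(prompt: str) -> str:
    # Placeholder for the model call (assumption).
    return f"<answer grounded in:\n{prompt}>"

def rag_answer(question: str) -> str:
    # Retrieve context, then ask the model to answer using only that context.
    context = "\n".join(retrieve(question))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return call_llm(prompt)

print(rag_answer("How does RAG work?"))
```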
2. Agentic workflows
- What are LLM Agents?
- Key components of LLM agents, with an overview of open-source frameworks such as LangChain and LlamaIndex
- Use cases
3. Hands-on: Building a GenAI Application
- Building a RAG and Agent application
- Implementing monitoring & logging for performance tracking
- Optimizing deployment for cost and scalability
4. Observability, Monitoring, and Feedback Loops
- Tracking LLM performance: Latency, token usage, response quality (see the sketch after this list)
- Detecting hallucinations and errors
- Implementing feedback loop mechanisms
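A minimal sketch of the kind of per-request metrics such monitoring collects (latency and rough token counts); the function names and the whitespace-based token count are illustrative assumptions, not the instrumentation used in class:

```python
import time

def call_llm(prompt: str) -> str:
    # Placeholder model call (assumption).
    time.sleep(0.1)
    return "GenAIOps is the operational discipline around generative AI systems."

def monitored_call(prompt: str) -> dict:
    """Wrap a model call and record latency and approximate token usage,
    the kind of per-request metrics a dashboard or feedback loop consumes."""
    start = time.perf_counter()
    response = call_llm(prompt)
    latency_ms = (time.perf_counter() - start) * 1000
    return {
        "latency_ms": round(latency_ms, 1),
        "prompt_tokens": len(prompt.split()),       # crude whitespace count, not a real tokenizer
        "completion_tokens": len(response.split()),
        "response": response,
    }

print(monitored_call("What is GenAIOps?"))
```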
5. Security, Compliance, and Responsible AI
- Data privacy in LLMOps: PII redaction, secure API handling
- Regulatory compliance
- Bias detection and mitigation strategies
Registration
Registration deadline
Fees:
- CHF 1300.- for the full course
- CHF 650.- (reduced rate)
- An additional CHF 200.- for the Micro-credential
Speakers
Hisham MOHAMED, PhD, University of Geneva
Hisham is an AI and machine learning expert with over 10 years of experience in machine learning, software engineering, and big data. With a PhD in Computer Science from the University of Geneva, he has led high-impact projects and built and managed diverse teams.
Hisham has deep experience in deploying and scaling AI systems. In this course, he will focus on GenAIOps/LLMOps, sharing insights on managing, optimizing, and operationalizing large language models in real-world applications.
Director(s)
Prof. Giovanna DI MARZO SERUGENDO, Centre universitaire d'informatique (CUI), University of Geneva