Programme

PyCon SG 2026 Schedule

19th - 21st June 2026

0815 - 0900 Registration
0900 - 0915 Day 1 Opening
0915 - 1000
Venue: Event Hall 1-1 + Event Hall 1-2
Keynote 1: So Kiasu, Still Kena Replaced by AI?
Director / Vice Chair - Python Software Foundation, Georgi Ker

Many of us in Asia grew up with the same advice: study hard, choose the right path, and your future will be secure. We did everything right. Extra tuition. Good grades. Stable job. And somehow AI still showed up to the interview. Today the rules are shifting. AI is changing how work evolves, teachers are rethinking how learning happens, and Gen Z is entering the workforce with very different expectations about what a career looks like. In this keynote, I share a story that is part career survival guide and part love letter to everyone who ever felt the playbook wasn't written for them. I'll talk about AI, Gen Z, teachers, open source, and why the most important thing I ever learned wasn't in any curriculum. This talk explores what it really means to stay relevant in an AI-shaped world, and why the future may belong not to those who study once, but to those who keep learning together.

1000 - 1045
Venue: Event Hall 1-1 + Event Hall 1-2
Keynote 2: Using Tools Without Being Used: The Four Stages of AI Fluency and What They Make Possible Together
Professor @ NUS; Director Google-NUS Innovation and Research Center, Anthony Tung

Most conversations about AI at work stop at the copilot: the AI that sits beside you and speeds up what you already do. It is useful, but it is only the first stage of a much longer progression — and professionals who settle there are quietly optimizing themselves into the category of work that AI will replace next. This talk offers a four-stage model of AI fluency — monolingual silos, AI bilingualism, AI trilingualism, and AI multilingualism — and argues that the capacity to move between stages rests on a single habit of mind, stated at two levels. At the individual level, the habit is to use tools without being used by them. I will draw a sharp line between AI-using (faster execution within a defined role) and AI thinking (how a person frames, decomposes, contests, and integrates what the machine produces), and show through concrete examples how a goal-oriented query and an exploratory one on the same question produce fundamentally different kinds of thought. At the collective level, the same habit becomes the precondition for harmony without uniformity. Only the professional who has released the goal-grip can truly sit with a colleague's framing without trying to convert it — letting the clinician's burden remain clinical and the accountant's burden remain financial while still holding them together. I will show how private-assistant culture quietly produces the opposite: uniform AI-mediated voice across a team that masks the disappearance of real disciplinary difference. Drawing on my DeepConnect research, I will argue that shared infrastructure — the kind that lets teams accumulate understanding over time rather than losing it between meetings — is what makes collective sensemaking scale without flattening, and I will locate this argument inside Singapore's AI-bilingual workforce agenda and the SkillsFuture conversation it is shaping. The divide ahead is not between those who use AI and those who don't. 
It is between those ruled by their tools, and those who, having stepped free of that rule, can finally connect through them.

1045 - 1100 Phototaking & Setup for concurrent sessions
1100 - 1130
Venue A - Event Hall 1-1
You Don’t Need an LLM to Understand Your Data
AI Software Developer - Cambridge University Press & Assessment, Cyrus Mante

Large Language Models dominate today’s NLP conversations, but they are often the wrong tool for the first and most important question: what is my data actually about? When working with thousands or millions of documents, jumping straight to prompting can be expensive, opaque, and misleading. In this talk, we revisit topic modeling as a practical, scalable, and interpretable way to uncover structure in large text collections. Rather than treating it as a relic of the past, we will explore how modern topic modeling fits naturally into today’s Python data stack and complements LLM-based systems.

Venue B - Event Hall 1-2
Merlions, Agents & Copilot: Trustworthy Python on Azure
Developer Engagement Lead ANZ & Asia - Microsoft, Michelle Mei Ling Sandford

Building real-world multiagent systems in Python doesn't have to be dry or mysterious! In this lively 30-minute session, you'll meet a cast of Singapore-inspired Python agents that recommend hawker fare, track haze, and liven things up with a wisecracking Merlion. See how GitHub Copilot can supercharge development, while practical trust patterns and Azure deployment keep your code friendly, robust, and observable, even when things don’t go as planned. Whether you're new to multiagent systems, exploring GitHub Copilot, or eager to represent Singapore in your code, you'll learn strategies for transparency and reliability that are as fun as they are practical.

1130 - 1200
Venue A - Event Hall 1-1
Designing Python APIs for Data You Don’t Control
Senior Developer Community Manager - Apify, Saurav Jain

The web isn’t an API, but Python developers often treat it like one. This talk explores how to design Python interfaces for unstable data sources, focusing on schema evolution, defensive parsing, and protecting downstream users.
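
As an illustration of the defensive-parsing idea in this abstract (not code from the talk), here is a minimal sketch. The `Product` type, the `name`/`title` field rename, and the price formats are hypothetical assumptions chosen only to show the pattern: accept old and new field names, coerce loosely typed values, and fail loudly only when a record is unusable.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Product:
    """Stable shape exposed to downstream users (hypothetical example type)."""
    name: str
    price: Optional[float]


def parse_product(raw: dict[str, Any]) -> Product:
    """Turn one scraped record into a Product, tolerating source changes."""
    # Schema evolution: assume the source renamed "name" to "title" at some point.
    name = raw.get("name") or raw.get("title")
    if not isinstance(name, str) or not name.strip():
        raise ValueError("product record has no usable name")

    # Defensive parsing: a missing or unparseable price becomes None, not a crash.
    price_raw = raw.get("price")
    price: Optional[float] = None
    if price_raw is not None:
        try:
            price = float(str(price_raw).replace("$", "").replace(",", ""))
        except ValueError:
            price = None

    return Product(name=name.strip(), price=price)


old_style = parse_product({"name": "Kaya Toast", "price": 2.5})
new_style = parse_product({"title": " Kaya Toast ", "price": "$2.50"})
no_price = parse_product({"title": "Kopi", "price": "N/A"})
```

Downstream code only ever sees `Product`, so a change in the source's field names or formats is absorbed in one place.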

Venue B - Event Hall 1-2
Tokenization in Transformers v5: Simpler, Clearer, and More Modular
Machine Learning Engineer - Hugging Face, Aritra Ray Gosthipaty

Tokenization is one of the most critical but least understood parts of modern NLP systems. For years, tokenizers in transformers were treated as opaque artifacts tied to pretrained checkpoints, making them hard to inspect, customize, or train from scratch. Transformers v5 introduces a major redesign that changes this paradigm. Tokenizers are now explicit, modular architectures, much like PyTorch nn.Module classes, with structure clearly separated from learned vocabulary and merge rules. In this talk, we’ll demystify how tokenization actually works in Transformers v5. We’ll walk through the tokenization pipeline, explain how the tokenizers Rust backend integrates with transformers, and show how the new class hierarchy enables inspection, customization, and training of model-specific tokenizers from scratch. By the end, attendees will understand how modern tokenizers are built, why the v5 redesign matters, and how to confidently work with tokenizers as first-class, configurable components rather than black boxes.
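
To make "structure separated from learned merge rules" concrete, here is a toy byte-pair-encoding sketch in plain Python. It is an illustration only and does not use or mirror the Transformers v5 or tokenizers API; the function name `bpe_encode` and the merge list are made up for the example. The algorithm (the structure) is fixed, while the merge rules (the learned part) are passed in as data.

```python
def bpe_encode(word, merges):
    """Apply learned merge rules to one word, lowest-ranked pair first."""
    ranks = {pair: rank for rank, pair in enumerate(merges)}
    symbols = list(word)
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        best = min(pairs, key=lambda pair: ranks.get(pair, float("inf")))
        if best not in ranks:
            break  # no learned rule applies any more
        merged = []
        i = 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


# Hypothetical learned merge rules, in priority order.
toy_merges = [("l", "o"), ("lo", "w"), ("e", "r")]
tokens = bpe_encode("lower", toy_merges)
print(tokens)
```

Swapping in a different merge list changes the output without touching the algorithm.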

1200 - 1330 Lunch & Networking
1330 - 1400
Venue A - Event Hall 1-1
Inside CPython: How the Open Source Interpreter Really Works
Associate Software Engineer - Red Hat, Allen Yesudasan

CPython is the reference implementation of Python used by millions of developers, yet few understand what happens beneath the surface when a Python program runs. This talk provides a practical tour of the CPython interpreter, from source code to execution. We will explore how Python code is compiled into bytecode, how the virtual machine executes instructions, how memory management and reference counting work, and how core components like the Global Interpreter Lock influence performance. Attendees will gain a clearer mental model of Python’s internals and learn how this knowledge can improve debugging, performance tuning, and everyday development.
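
Two of the topics named in this abstract, bytecode compilation and reference counting, can be poked at from the standard library. The sketch below is an illustration and not material from the talk. Exact opcode names and absolute reference counts vary between CPython versions, so it only looks at opcode name prefixes and at the change in the count.

```python
import dis
import sys


def add(a, b):
    return a + b


# CPython compiles the function body to bytecode; list the instruction names.
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(opnames)


def refcount_delta():
    """Show that holding one more reference raises the count by one."""
    obj = object()
    before = sys.getrefcount(obj)
    holder = [obj]  # the list stores one extra strong reference
    after = sys.getrefcount(obj)
    return after - before


delta = refcount_delta()
print(delta)
```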

Venue B - Event Hall 1-2
From Audio to Ecology: A BirdNET Pipeline for Bird Monitoring in Singapore
Engineer, MVP Studio at NUS Enterprise, Aryn Choong Yue Lin

This project uses passive acoustic recordings from sites across Singapore and the BirdNET deep-learning model to detect bird species and track their calling activity over time. After filtering BirdNET outputs to improve reliability, the study estimates species richness, community composition, and diel activity patterns, compares these across different habitat types, and links variation in detections to environmental factors such as vegetation cover, road proximity, and weather.

1400 - 1430
Venue A - Event Hall 1-1
Building Practical HealthTech AI Systems with Python: Lessons from HealthPredictor.AI
Founding CTO, HealthPredictor.ai, Melvin Ng, RPh

Building AI systems in practice is very different from tutorials and demos. Beyond models and datasets, engineers must navigate messy data, system design trade-offs, and real deployment constraints. In this talk, I share the journey of building HealthPredictor.AI, an applied AI/ML startup initiative, and the lessons learned along the way. I will walk through how Python was used across the data pipeline, from data processing and model development to system integration. We will explore key challenges such as handling imperfect data, making pragmatic architectural decisions, and balancing accuracy with usability. This talk builds upon a version previously presented at FOSSASIA Summit, an international conference with over 5,000 attendees, and has been expanded with additional Python-focused content and technical depth for the PyCon audience. Attendees will leave with practical insights they can apply to their own AI projects.

Venue B - Event Hall 1-2
How to Write Terrible Python Code (And Why It Works… Until It Doesn’t)
Software Architect - Python and Open Source - EPAM Systems, Vivek Keshore

What if we deliberately write the worst Python code possible? In this talk, I will intentionally write Python code full of classic anti-patterns: global state chaos, god functions, mutable default arguments, swallowed exceptions, overly clever one-liners, and inheritance abuse. Each example "works", until it doesn’t. Through short live demos and interactive moments, I will unpack the impact of this bad code, how Python actually behaves, and how small decisions slowly turn into fragile systems. I will refactor each example into a simpler, more Pythonic version and discuss the thinking behind the change. This is not a talk about shaming bad code. It’s a talk about recognizing human behavior in software design, developing better instincts, and learning to value boring, readable Python over clever solutions. Attendees will leave with a sharper eye for problematic code, practical refactoring heuristics, and a renewed appreciation for clarity over complexity.
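
One of the anti-patterns listed in this abstract, the mutable default argument, fits in a few lines. This is an illustrative sketch rather than an example from the talk, and the function names are made up. The default list is created once, when the function is defined, so every call that omits the argument shares it.

```python
def append_bad(item, bucket=[]):
    # The same list object is reused across calls that omit bucket.
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # A fresh list is created per call when the caller passes nothing.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


bad_first = list(append_bad(1))
bad_second = list(append_bad(2))   # state leaked from the first call
good_first = append_good(1)
good_second = append_good(2)

print(bad_first, bad_second)
print(good_first, good_second)
```

The first version "works" for a single call, which is why the bug often survives until a second caller appears.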

1430 - 1500
Venue A - Event Hall 1-1
Beyond the Weights: Surviving Python's AI Supply Chain Minefield
Red Team Lead, Security Assurance - ByteDance, Yuli Zhan

In March 2026, attackers compromised Trivy — a security scanner trusted by thousands of CI/CD pipelines — and used it to steal PyPI publishing tokens from LiteLLM, one of the most widely deployed AI gateways in the Python ecosystem. Within hours, poisoned package versions were live on PyPI, silently harvesting cloud credentials from every Python process on infected hosts. The malicious code never needed to be imported — it ran on interpreter startup. This is the new shape of supply chain attacks in the AI era. The threat surface has expanded beyond requirements.txt typosquatting into territory most developers aren't watching: weaponised CI/CD tooling, stealthy .pth file persistence, and inference-time backdoors hidden inside GGUF chat templates that manipulate LLM outputs without touching a single weight. This talk dissects both attack classes with technical depth, then pivots to practical defence. Drawing on experience as the Global Red Team Lead at ByteDance/TikTok, I'll walk through how our detection pipeline would have caught the LiteLLM compromise, and how developers can apply the same principles with open-source tooling. Attendees will leave with a concrete, layered defence strategy they can start implementing the same week.
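
The claim that code in a .pth file runs without being imported follows from how the standard `site` module works: when it processes a site directory, it executes any .pth line that begins with `import`. The sketch below is an illustration, not the detection pipeline described in the abstract. It writes a harmless .pth file to a temporary directory, flags its executable line with a small hypothetical scanner (`executable_pth_lines`), and then uses `site.addsitedir` to show the line being run.

```python
import os
import site
import sys
import tempfile


def executable_pth_lines(directory):
    """Return (filename, line) pairs that the site module would execute."""
    hits = []
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".pth") or name.startswith("."):
            continue
        with open(os.path.join(directory, name), encoding="utf-8") as fh:
            for line in fh:
                # site executes lines that begin with "import" plus a space or tab.
                if line.startswith(("import ", "import\t")):
                    hits.append((name, line.strip()))
    return hits


with tempfile.TemporaryDirectory() as tmp:
    # Harmless stand-in for a payload: it only sets an environment variable.
    with open(os.path.join(tmp, "demo.pth"), "w", encoding="utf-8") as fh:
        fh.write("import os; os.environ['PTH_DEMO'] = 'executed'\n")

    findings = executable_pth_lines(tmp)

    # addsitedir processes .pth files the way site does for site-packages.
    site.addsitedir(tmp)
    sys.path[:] = [p for p in sys.path if p != tmp]  # undo the sys.path change

print(findings)
print(os.environ.get("PTH_DEMO"))
```

Pointing a scanner like this at real site-packages directories is one simple way to review which installed packages ship startup-time code. Some legitimate packages do, so hits need manual review.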

Venue B - Event Hall 1-2
Mastering the Modern Database Lifecycle with Navicat Premium
Navicat, Chuan Tze Bok

Managing databases shouldn't be a fragmented, time-consuming process. Whether you are designing a new schema from scratch, migrating massive datasets, or trying to make sense of complex metrics, modern database professionals need a unified, intelligent toolkit.

In this session, we will explore how Navicat Premium serves as an end-to-end workspace for the entire data lifecycle. We will walk through a live, practical workflow demonstrating how to seamlessly transition from initial database connection and table design to advanced AI-assisted querying and Business Intelligence visualization.

1515 - 1545
Venue A - Event Hall 1-1
Beyond the System Prompt: Scaling AI Capabilities with Agent Skills
Gen C Community Lead, Jiawei Lin

As LLM-powered agents move from simple chat interfaces to complex autonomous systems, developers face a recurring challenge: the "Monolithic Prompt." Cramming every tool, instruction, and piece of documentation into a system prompt leads to high latency, increased costs, and "context lost in the middle." In this session, we explore Agent Skills, an open standard and Python library designed to modularize agent capabilities. We will dive into the architecture of a "Skill" (a structured bundle of Markdown instructions, Python scripts, and reference data) and demonstrate how agents can dynamically discover and load these skills only when needed. You will also learn how Agent Plugins for AWS equip AI coding agents with the skills to help you architect, deploy, and operate on AWS. Agent Plugins for AWS are currently supported by Claude Code, Codex, and Cursor.

Venue B - Event Hall 1-2
Building a Flexible and Scalable Notification System
Backend Software Engineer - KakaoBank, GyeongSeon Park

What happens when your notification job silently sends the same message twice — because someone forgot deduplication logic that gets copy-pasted into every new job?

Every notification type on my team used to require its own cron job, its own deploy, and its own prayer. I wanted a way out — and ended up building a time-driven notification system that sends 60,000–130,000 messages to ~30,000 users in about 8 minutes, using Python, PostgreSQL, and Airflow.

The key idea: notification rules live in the schema, not the codebase.

Each rule is just a row: a cron schedule, a SQL query to find "what happened", another SQL query to find "who cares", and a template ID. Arguments flow from one query to the next through string.Template substitution. Need a new notification type? Insert a row. No deploy, no PR, no waiting on engineers. Our ops team started adding their own notifications within a day.
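
The rule-as-a-row idea can be sketched with an in-memory SQLite database. This is a guess at the shape of the pattern and not the speaker's code: the table names, columns, and sample data are all hypothetical, SQLite stands in for PostgreSQL, and the cron field is stored but not evaluated (the abstract names Airflow for scheduling). Values flow from the "what happened" query into the "who cares" query through `string.Template`, as the abstract describes.

```python
import sqlite3
from string import Template

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
CREATE TABLE orders (id INTEGER, status TEXT);
CREATE TABLE followers (order_id INTEGER, user_id INTEGER);
CREATE TABLE rules (cron TEXT, event_sql TEXT, audience_sql TEXT, template_id TEXT);
INSERT INTO orders VALUES (1, 'shipped'), (2, 'pending');
INSERT INTO followers VALUES (1, 10), (1, 12), (2, 11);
""")

# Adding a notification type is an INSERT, not a deploy.
conn.execute(
    "INSERT INTO rules VALUES (?, ?, ?, ?)",
    (
        "*/5 * * * *",
        "SELECT id AS order_id FROM orders WHERE status = 'shipped'",
        "SELECT user_id FROM followers WHERE order_id = $order_id ORDER BY user_id",
        "order_shipped",
    ),
)


def run_rule(conn, rule):
    """Run one rule row and return (user_id, template_id, args) tuples."""
    messages = []
    for event in conn.execute(rule["event_sql"]).fetchall():
        args = dict(event)  # column names become template arguments
        audience_sql = Template(rule["audience_sql"]).substitute(args)
        for person in conn.execute(audience_sql).fetchall():
            messages.append((person["user_id"], rule["template_id"], args))
    return messages


rule = conn.execute("SELECT * FROM rules").fetchone()
messages = run_rule(conn, rule)
print(messages)
```

Because `string.Template` pastes values into the SQL text, this sketch assumes the substituted values come from trusted queries. For untrusted input, bound parameters are the safer choice.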

I'll walk through the architecture, show real code, and share what broke along the way. You should be able to steal this pattern for your own system.

🔗 Article introducing this architecture - DevOps.dev Medium: https://medium.com/p/eef601f22518

1600 - 1630
Venue A - Event Hall 1-1
Adopting uv and pyproject.toml for mono-repo: Challenges and Approach.
Machine Learning Engineer - Grab, Mou Yuan Yap

Managing Python dependencies across a large ML pipeline mono-repo is notoriously painful. Every job environment was baked into an opaque Docker image maintained in a separate repository — requiring cross-repo pull requests, manual image tag updates, and zero local reproducibility. Teams had no lockfiles, no standardization, and little visibility into what was actually running in production.

This talk walks through the real-world journey of adopting pyproject.toml and uv across a large ML pipeline mono-repo. We'll cover the structural pattern that makes per-job isolation clean in a mono-repo, how a curated golden base image matrix tames CUDA and ML framework compatibility, and how a platform-hosted constraint file eliminates silent version drift after platform upgrades — all without requiring teams to coordinate manually.

You'll leave with a concrete recipe for adopting uv and pyproject.toml in your own mono-repo, an honest account of the friction points, and a clear picture of how the pieces fit together from local dev through to production CI.

Venue B - Event Hall 1-2
Democratizing Data Science: How We Unlocked Forecasting for Every Team at Grab
Principal Machine Learning Engineer - Grab, Marc Schwalbach

An initial attempt to build a single API for forecasting proved unsuccessful. This failure motivated us to pivot to building Spyce, an internal Python package that democratized forecasting and accelerated the development of forecast-backed solutions across the company.

This approach streamlined the workflow of Data Scientists and enabled teams without Data Scientists to explore forecasting solutions for their applications, reducing the POC phase from weeks to days.

Today, Spyce serves as the backbone for various critical downstream applications at scale across hundreds of cities, including anomaly detection (significantly reducing false positives) and supply & demand forecasting (improving forecasts by 35-60% for key holidays). This talk will cover the initial platformization failure, the pivot to serve our users, and the package’s role in the foundation model and agentic era.

🔗 https://marcschwalbach.com

1630 - 1730 Work on AI Agent / Hackathon | Day 1 Closing