2025-12-28: IEEE High-Performance Computing 2025 Trip Report
This trip can be broken down into two main parts: a short sightseeing visit to Delhi and Agra, followed by the conference in Hyderabad.
Delhi and Agra
The trip started with a couple of days in New Delhi. Since I had not been to this part of the world before, I wanted to take the opportunity to explore the city before the conference. Delhi is enormous, both geographically and culturally. Over the first two days of the trip, I ended up walking more than 50 kilometers. During this time, I visited several UNESCO World Heritage sites, including the Taj Mahal, Agra Fort, and landmarks within Delhi such as Qutub Minar, the world's tallest brick minaret. While in Delhi, I also met up with a few friends from Fermilab. Although we didn't do any sightseeing together, we did manage to go out for dinner one evening, which was a nice break from traveling and a fun way to catch up before the conference started. Starting the trip with sightseeing was a great contrast to the dense technical program that followed.
HiPC 2025
I arrived in Hyderabad on December 17, one day before the main technical program began. HiPC 2025 turned out to be the largest conference I have attended so far, both in terms of attendance and breadth of topics covered, spanning high-performance computing, AI systems, and quantum computing.
Day 1: December 18, 2025
The main conference started on December 18. One notable statistic that stood out in the opening session was that only 29% of all submitted papers were accepted to the main proceedings. I was fortunate that Zeus was part of that small fraction.
The Day 1 keynote was given by Dr. Pratyush Kumar from Sarvam AI. His talk focused on what it actually takes to train large language models from scratch. He walked through the challenges of setting up compute and data infrastructure and shared lessons learned while building LLMs in practice. One part I found especially interesting was his discussion of real-world applications, including examples where language models helped dub educational videos into multiple languages in real time while preserving the original speaker's voice. Overall, the keynote gave a very practical view of LLM development, beyond just model architectures.
The rest of the day featured workshops and technical sessions covering HPC systems, AI, and education.
Day 2: December 19, 2025
On Day 2, I presented Zeus, our GPU-accelerated hybrid optimization algorithm, which combines:
- Particle Swarm Optimization (PSO) to perform an initial global search and identify promising regions of the search space,
- BFGS, a quasi-Newton method, for fast local convergence,
- Automatic Differentiation (AD) to compute gradients accurately without requiring users to manually derive them,
- and massively parallel GPU execution, where hundreds or thousands of independent optimizations run concurrently.
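Forward-mode AD, mentioned above, propagates derivatives alongside values so that gradients come out of an ordinary function evaluation. Here is a minimal, self-contained sketch of the idea using dual numbers; the class and function names are illustrative and not from the Zeus codebase.

```python
class Dual:
    """Number carrying a value and one derivative (the 'dual' part)."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)
    __rmul__ = __mul__

def grad(f, x):
    """Gradient of f at x via n forward passes, one per coordinate."""
    n = len(x)
    g = []
    for i in range(n):
        # Seed coordinate i with derivative 1, all others with 0.
        duals = [Dual(x[j], 1.0 if j == i else 0.0) for j in range(n)]
        g.append(f(duals).dot)
    return g

# f(x, y) = x^2 + 3xy, so df/dx = 2x + 3y and df/dy = 3x
f = lambda v: v[0] * v[0] + 3 * v[1] * v[0]
print(grad(f, [2.0, 1.0]))  # → [7.0, 6.0]
```

One forward pass per input coordinate is exactly why forward mode suits problems with few parameters, like the per-fit objectives described here.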
The algorithm operates in two phases. First, a small number of PSO iterations are used to improve the quality of the starting points. In the second phase, each particle independently invokes a BFGS optimization on the GPU, using forward-mode AD to compute gradients efficiently. Once sufficient convergence is detected, the threads coordinate through atomic operations to terminate early.
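The two-phase structure can be sketched on a CPU in a few dozen lines. This is only an illustration of the idea, not the Zeus GPU implementation: SciPy's BFGS stands in for the on-device local solver, the local runs are sequential rather than concurrent, and all parameter values are made up for the example.

```python
import numpy as np
from scipy.optimize import minimize

def rastrigin(x):
    """Classic multimodal benchmark; global minimum 0 at the origin."""
    return 10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

def pso_then_bfgs(f, dim=2, particles=32, pso_iters=20, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5.12, 5.12, (particles, dim))  # particle positions
    v = np.zeros_like(x)                            # particle velocities
    pbest = x.copy()                                # per-particle best points
    pbest_f = np.array([f(p) for p in x])
    gbest = pbest[np.argmin(pbest_f)]

    # Phase 1: a few PSO iterations to improve the starting points.
    for _ in range(pso_iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (gbest - x)
        x = x + v
        fx = np.array([f(p) for p in x])
        improved = fx < pbest_f
        pbest[improved], pbest_f[improved] = x[improved], fx[improved]
        gbest = pbest[np.argmin(pbest_f)]

    # Phase 2: each particle launches an independent BFGS run
    # (these run concurrently on the GPU in Zeus; sequential here).
    results = [minimize(f, p, method="BFGS") for p in pbest]
    best = min(results, key=lambda r: r.fun)
    return best.x, best.fun

xmin, fmin = pso_then_bfgs(rastrigin)
print(xmin, fmin)
```

The division of labor mirrors the description above: PSO handles the multimodal global landscape cheaply, and BFGS supplies fast local convergence from each refined starting point.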
By running many independent optimizations in parallel on GPUs, Zeus achieves 10x--100x speedups over a perfectly parallel CPU implementation while also improving accuracy compared to existing GPU-based methods. One of the advantages of the parallel algorithm is that it is less sensitive to poor starting points, whereas for the sequential version, we must repeatedly restart until sufficient convergence is achieved.
In the talk, I also discussed experimental results from both synthetic benchmark functions, such as the Rastrigin and Rosenbrock functions, and a real-world high-energy physics application. The example plot shows simulated data from proton-proton collisions at the Large Hadron Collider. When protons collide, their quarks and gluons produce sprays of particles called jets. When two jets are produced, their invariant mass can be reconstructed and fitted by minimizing a negative log-likelihood. The pull distribution measures how far each data point is from the fit, in units of its expected uncertainty. A good fit should have pulls fluctuating around zero and mostly within ±2σ, indicating agreement between the simulated data and the model prediction. I also touched on current limitations, such as handling objectives with discontinuous derivatives, and outlined future work, including deeper levels of parallelism and improved stopping criteria.
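The pull defined above takes one line to compute. A toy illustration (not the analysis code from the talk), assuming Poisson-distributed bin counts so that each bin's uncertainty is approximately the square root of its predicted count:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical model prediction per histogram bin, and simulated
# observed counts drawn from it (Poisson statistics).
predicted = np.array([100.0, 250.0, 400.0, 250.0, 100.0])
observed = rng.poisson(predicted).astype(float)

# pull_i = (observed_i - predicted_i) / sigma_i, with sigma_i ≈ sqrt(predicted_i)
sigma = np.sqrt(predicted)
pulls = (observed - predicted) / sigma

print(pulls)  # a good fit: values scattered around zero, mostly within ±2
```

If the model describes the data, the pulls should behave like standard normal draws, which is exactly the visual check described above.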
Presenting this work felt especially meaningful because it tied together my internship experience at Fermilab and my growing interest in high-performance computing. It was rewarding to share our ideas with the community and see how the broader themes of the conference connected directly with our contribution.
Day 3: December 20, 2025
The third and final day focused heavily on AI/ML topics, featured another engaging keynote, and concluded with a quantum computing workshop.
The Day 3 keynote was given by Dr. Christos Kozyrakis from Stanford University and NVIDIA Research. His talk focused on how AI workloads are shaping modern datacenter design. He argued that current AI systems often follow a supercomputing-style approach, which may not be the best fit as models continue to scale.
Instead, he made a case for scale-out AI systems, where efficiency and system-level design play a bigger role. One idea that stayed with me was his discussion of power and energy efficiency, especially the question of how much AI can realistically fit within a gigawatt of power.
Later in the day, I attended the Quantum Computing Workshop, one of the highlights of the conference for me. It was particularly exciting because I will be taking the Quantum Computing course in Spring 2026, and I am interested in exploring how Zeus could be mapped onto a hybrid classical-quantum optimization algorithm.
To close the workshop, a speaker from Fujitsu presented the current state of their quantum research, including ambitious plans toward a 1000-qubit machine. After the workshop, I had several valuable discussions with experts in the field. In particular, Dr. Anirban Pathak provided initial guidance on how my current algorithm could be adapted toward a hybrid classical-quantum approach.
Additionally, Aravind Ratnam pointed me to Q-CTRL's learning tutorials, which he recommended as an excellent hands-on resource for building a stronger foundation in quantum computing.
To close the conference, I attended the banquet, which featured a cultural program and an Indian dinner at the rooftop restaurant.
Closing Thoughts
HiPC 2025 was only my second conference, and it was both intense and deeply rewarding. Compared to my first, I felt noticeably more confident presenting my work, asking questions, and engaging with researchers across different fields. At the same time, the experience reinforced a familiar lesson: conferences are just as much about people and conversations as they are about papers and talks.
I am grateful for the opportunity to present this work, for the feedback I received, and for the many discussions that will shape my future research directions. HiPC 2025 was an unforgettable experience, and I hope to return again.
~Dominik Soós (@DomSoos)



