2025-09-03: Summer Internship with Microsoft

This summer, I had the privilege of interning with the Microsoft AI organization in Redmond, Washington, USA. The internship was a 12-week program that started on May 19, 2025. I worked as a data scientist intern under the supervision of Yatish Singh. Throughout the internship, I attended weekly meetings with the entire Microsoft Bing team, where I reported my progress, received feedback, resolved issues, and refined the solution. I also had one-on-one meetings with my mentor, Tom Potier, twice a week to discuss my progress and any issues I faced, and I met with my manager, Yatish Singh, at least twice a week.

My team was responsible for allocating cloud resources, such as servers, virtual machines (VMs), and operating systems (OS), to customers and partners across several regions globally. At the time of my internship, resources were allocated manually, which usually delayed allocation by weeks. The manual process was also prone to error. This blog post covers the goal, solution, and achievements of my internship project.

Project Goal

My goal was to automate the manual resource allocation process and to optimize the allocation so that fewer resources are left unused.

To tackle the problem, I developed a two-stage solution consisting of a mathematical model and an LLM-based multi-agent system to automatically design and optimize a cluster configuration of resources to allocate to customers.

The key components were:

Mathematical model for cluster-type generation: This model allowed us to design the cluster configurations automatically by first determining the number of clusters needed to satisfy the customers' demands, then deciding the pools of resources within each cluster.

LLM-based multi-agent system for cluster-type optimization: This framework was introduced to improve the configurations produced by the mathematical model.
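
To make the two steps of the mathematical model concrete, here is a minimal Python sketch. It is not the model used in the project. It assumes a single resource type, a fixed per-cluster capacity, and demands that may be split across clusters. The names Cluster, generate_cluster_types, and capacity are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Cluster:
    capacity: int  # hypothetical: units of one resource a cluster can hold
    pools: dict = field(default_factory=dict)  # customer name -> units assigned

    @property
    def used(self) -> int:
        return sum(self.pools.values())

    @property
    def free(self) -> int:
        return self.capacity - self.used


def generate_cluster_types(demands: dict, capacity: int) -> list:
    """Step 1: count the clusters needed. Step 2: fill their pools greedily."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    if any(need < 0 for need in demands.values()):
        raise ValueError("demands must be non-negative")

    # Step 1: the smallest number of clusters whose total capacity covers demand.
    n_clusters = math.ceil(sum(demands.values()) / capacity)
    clusters = [Cluster(capacity) for _ in range(n_clusters)]

    # Step 2: place the largest demands first, spilling into the next
    # cluster whenever the current one is full.
    idx = 0
    for customer, need in sorted(demands.items(), key=lambda kv: -kv[1]):
        while need > 0:
            cluster = clusters[idx]
            if cluster.free == 0:
                idx += 1
                continue
            take = min(need, cluster.free)
            cluster.pools[customer] = cluster.pools.get(customer, 0) + take
            need -= take
    return clusters


if __name__ == "__main__":
    for c in generate_cluster_types({"a": 70, "b": 50, "c": 30}, capacity=64):
        print(c.pools, "free:", c.free)
```

A real model would handle several resource types and regional constraints at once. The point of the sketch is only the order of decisions: cluster count first, pool contents second.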

Figure 1. Cluster Type Generation and Optimization

One of the most exciting parts of my project was creating the LLM-based multi-agent system for optimizing the generated cluster-type configurations. This section explains the challenges, the design, and the impact of my solution on my team.

Designing the LLM-based Multi-agent System for Cluster-Type Optimization

My mathematical model automated the manual process of generating the cluster-type configurations for my team. However, the generated configurations left many resources unused. To address this challenge, I developed a multi-agent system consisting of three agents (an optimization agent, a quality assurance agent, and a project management agent) to optimize the generated configurations. Below is a step-by-step explanation of how my solution works:

User Input: Customers across several regions submit their resource demands (servers, VMs, OS) to our platform. We then automatically generate JSON files containing each customer's demand and the available resources.

Mathematical Model: The mathematical model receives this input file and, in less than 1 minute, generates a configuration file containing all the cluster types to deploy to the customers based on their demands and the available resources.

LLM-based Multi-agent System: A system consisting of three agents described below.

Optimization agent: This agent receives the generated configuration and checks that it satisfies the customers' demands and that the provisioned resources do not exceed the available resources. It then checks how many resources are left unused (the goal is fewer than 1000 unused IPs). If more than 1000 IPs are unused, it modifies the configurations of pools that use a smaller VM size, provided a larger size is available.
Quality assurance agent: This agent analyzes the configurations modified by the optimization agent and provides a detailed analysis of their quality, areas for improvement, and specific steps for improvement.
Project management agent (decision agent): This agent decides whether the loop can be terminated or the configurations should be passed back to the optimization agent for improvement.
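
The loop between the three agents can be sketched roughly as follows. This is an illustration under stated assumptions, not the deployed system: each agent is replaced by a plain deterministic function instead of an LLM call, the size ladder and the IP cost per VM are invented numbers, and "modifying a pool" is read here as moving it up one VM size. Only the 1000-IP threshold comes from the description above. In this simplified version the quality assurance report is used only by the decision step.

```python
from copy import deepcopy

UNUSED_IP_TARGET = 1000  # threshold from the description above

# Hypothetical size ladder and IP cost per VM, invented for this sketch.
SIZE_LADDER = ["small", "medium", "large"]
IPS_PER_VM = {"small": 1, "medium": 2, "large": 4}


def unused_ips(config):
    """IPs left over after provisioning every pool in the configuration."""
    used = sum(IPS_PER_VM[p["vm_size"]] * p["vm_count"] for p in config["pools"])
    return config["available_ips"] - used


def optimization_agent(config):
    """Stand-in for the LLM: upsize one pool if the result stays feasible."""
    if unused_ips(config) <= UNUSED_IP_TARGET:
        return config
    candidate = deepcopy(config)  # never mutate the caller's configuration
    by_size = sorted(candidate["pools"], key=lambda p: SIZE_LADDER.index(p["vm_size"]))
    for pool in by_size:  # smallest VM sizes first
        rank = SIZE_LADDER.index(pool["vm_size"])
        if rank + 1 < len(SIZE_LADDER):
            previous_size = pool["vm_size"]
            pool["vm_size"] = SIZE_LADDER[rank + 1]
            if unused_ips(candidate) >= 0:
                return candidate
            pool["vm_size"] = previous_size  # would exceed available IPs; revert
    return config


def quality_assurance_agent(config):
    """Stand-in for the LLM: report feasibility and distance from the target."""
    left = unused_ips(config)
    return {
        "unused_ips": left,
        "feasible": left >= 0,
        "meets_target": 0 <= left <= UNUSED_IP_TARGET,
    }


def project_management_agent(report, previous_unused):
    """Stop when the target is met or the last round made no progress."""
    return report["meets_target"] or report["unused_ips"] == previous_unused


def run_loop(config, max_rounds=10):
    previous = unused_ips(config)
    for _ in range(max_rounds):
        config = optimization_agent(config)
        report = quality_assurance_agent(config)
        if project_management_agent(report, previous):
            break
        previous = report["unused_ips"]
    return config
```

The max_rounds cap and the no-progress check are design choices of the sketch. They guarantee termination even when the target cannot be reached, which matters once the stand-in functions are replaced by non-deterministic LLM calls.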

By introducing this two-stage solution, I ensured that the LLM agents start from the configuration produced by the mathematical model rather than from scratch, which helps prevent hallucination.

Key Accomplishments

By the end of my internship, I had successfully developed a two-stage solution that delivers cluster-type configurations to satisfy customer demands across 18 regions. My solution reduces the team's development time from 3-5 days to less than 1 hour. It also eliminates generation errors entirely. I also worked with the team to deploy my solution to the cloud to serve customers globally.

Conclusion

In conclusion, the solution I implemented consists of a mathematical model that generates cluster-type configurations and an LLM-based multi-agent system that optimizes the generated configurations. Through this project, I gained a deep understanding of developing autonomous LLM agents for real-world applications while contributing to the advancement of Microsoft AI's performance.

This was my fifth internship in the United States. It followed my internship at Amazon in the summer of 2024, where I focused on automating BDD testing, developing a Multi-turn LLM-based Transcript Generator to streamline the testing of Alexa Banyan skills, and developing Bedrock Agents and integrating them with an enhanced knowledge base for more accurate feature file generation. The experience I gained in AI-driven automation and LLMs, along with the opportunity to improve how I communicate technical concepts to diverse audiences, is invaluable for my future career.

Acknowledgments

I would like to express my gratitude to my PhD advisor, Dr. Jian Wu, for his boundless support and encouragement in getting this internship. I also thank my internship manager, Yatish Singh, and my mentor, Tom Potier, for guiding me throughout the internship with their feedback and suggestions. I am thankful for the opportunity to work as a data scientist intern with the Microsoft AI organization!

Kehinde Ajayi (@KennyAj)

Web Science and Digital Libraries Research Group