
Surendar Rama Sitaraman: Driving the Next Generation of AI and Intelligent Systems

Surendar Rama Sitaraman drives AI and intelligent systems by optimizing compilers and frameworks, enabling efficient, scalable AI deployment across diverse hardware for real-world impact.


Recognized for bridging research and applied engineering, Surendar has delivered breakthrough solutions in compiler technology, graphics optimization, and large-scale AI frameworks.

DNA India had the opportunity to meet Surendar Rama Sitaraman for an exclusive interview and learn about his remarkable journey from enterprise systems to cutting-edge AI compiler engineering. An alumnus of the University of Southern California and a contributor at global technology firms, Surendar has also built a record of significant research that continues to shape the discipline. He reflects on some of the innovations that are transforming the landscape of next-generation intelligent systems.

Q. Your career spans enterprise systems, graphics engineering, and AI frameworks. How did that happen?

A: My journey started in India as a software engineer at a multinational company, where I built enterprise software for Fortune 500 clients. That gave me solid exposure to secure coding, UI design, and large-scale system design.

Pursuing a Master's in Computer Science at the University of Southern California in Los Angeles was a dream come true for me. The program's research-based curriculum shaped my technical approach and led to my first U.S. internship, which exposed me to system-level performance tuning and instilled a love of solving challenging technical problems.

From there, I moved into graphics software engineering, doing DirectX 12 driver work and then performance optimization of Vulkan graphics drivers. Those roles paved the way for my current work - creating AI compilers and frameworks, where the challenges are even more exhilarating.

From enterprise systems to AI compilers, the interest has always been the same - building performance-tuned infrastructure that is user-friendly, scales well, and gets out of the way of a good user experience.

"If a system is transparent to the user, then the engineering behind it was done right," asserts Surendar Rama Sitaraman.

Q. You're working on AI frameworks and compiler design right now. What does that involve?

A: I work on enabling AI models to run efficiently on heterogeneous compute platforms such as CPUs, GPUs, and purpose-built AI accelerators. This requires optimizing inference pipelines so that models exported to open formats such as ONNX can be deployed seamlessly using tools like OpenVINO.
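
To make that pipeline concrete, here is a minimal sketch of the kind of deployment path he describes, using OpenVINO's Python API to load an ONNX model and run inference. The file name and input shape are illustrative assumptions, not details from his work.

```python
import numpy as np
import openvino as ov

# Load an ONNX model and compile it for a target device.
# "model.onnx" and the input shape below are placeholders.
core = ov.Core()
model = core.read_model("model.onnx")
compiled = core.compile_model(model, "CPU")  # or "GPU", an NPU, etc.

# Run a single inference on dummy data.
dummy_input = np.random.rand(1, 3, 224, 224).astype(np.float32)
result = compiled(dummy_input)
print(result[compiled.output(0)].shape)
```

The same script targets a different accelerator simply by changing the device string, which is the portability he is pointing to.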

My responsibilities include compiler backend porting, low-precision inference quantization, and operator-level optimizations that reduce memory latency and improve throughput. The goal is to make AI workloads faster, more efficient, and easy to deploy in production.
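
As an illustration of the low-precision quantization he mentions, here is a minimal, hypothetical sketch of symmetric per-tensor INT8 weight quantization - a standard formulation of the technique, not code from his stack.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor INT8 quantization (illustrative sketch)."""
    # Map the largest absolute weight onto the edge of the INT8 range.
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_int8(w)
print("max abs error:", np.abs(w - dequantize_int8(q, s)).max())
```

Storing weights in INT8 like this cuts memory traffic roughly fourfold versus FP32, which is where much of the latency and throughput gain comes from.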

Q. One of your main contributions has been porting an AI compiler stack onto LLVM 20. Why was that significant?

A: It was a foundational upgrade that kept the compiler current and scalable. The move from LLVM 19 to 20 introduced architectural changes that required reworking fundamental logic. I migrated key changes, streamlined verification processes, and established integrated build flags - enhancing maintainability and performance for subsequent AI frameworks.

Surendar Rama Sitaraman reiterated the central role of compilers in AI infrastructure: "Compiler technology is the backbone of efficient AI - it decides how close we can get to hardware limits."

Q. You have also optimized core AI operations such as ArgMax. Why are these key to success?

A: The ArgMax operation may sound simple, but performing it efficiently on specialized hardware requires precise coordination of compute and memory resources. I developed an approach that re-expresses ArgMax as a compiler-level TopK operation, which leverages hardware-optimized execution paths and significantly reduces latency for inference workloads.
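
To illustrate the equivalence behind this rewrite, here is a minimal NumPy sketch - not his compiler code - showing that the index output of TopK with k=1 is exactly ArgMax, which is why a compiler can lower ArgMax onto a hardware-optimized TopK path.

```python
import numpy as np

def topk(x: np.ndarray, k: int, axis: int = -1):
    """Return the top-k values and their indices along an axis."""
    idx = np.argsort(-x, axis=axis)                  # indices in descending order
    top_idx = np.take(idx, np.arange(k), axis=axis)  # keep the first k
    top_val = np.take_along_axis(x, top_idx, axis=axis)
    return top_val, top_idx

def argmax_via_topk(x: np.ndarray, axis: int = -1) -> np.ndarray:
    # ArgMax is simply the index output of TopK with k = 1.
    _, idx = topk(x, 1, axis=axis)
    return np.squeeze(idx, axis=axis)

x = np.random.rand(2, 5)
assert np.array_equal(argmax_via_topk(x, axis=1), np.argmax(x, axis=1))
```

In a real compiler the rewrite happens on the intermediate representation rather than in NumPy, but the mapping is the same: one specialized kernel can serve both operations.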

Q. Memory transfer and synchronization optimizations sound like jargon. Why are they so important?

A: In AI inference, there is not a microsecond to spare. Unnecessary memory copies or poorly tuned synchronization can drain power and slow down runtimes. By removing redundant DMA operations and optimizing barrier handling, we made execution plans leaner. That makes a huge difference for real-time workloads such as on-device vision or voice AI.
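
As a toy illustration of what removing unnecessary DMA operations can mean at the execution-plan level, here is a hypothetical sketch of a pass that drops copy operations whose destination is overwritten before it is ever read. The plan representation and operation names are invented for illustration, not taken from his runtime.

```python
from dataclasses import dataclass

@dataclass
class Op:
    kind: str        # "copy" (a DMA transfer) or "compute" - illustrative
    reads: set
    writes: set

def drop_dead_copies(plan: list[Op]) -> list[Op]:
    """Remove copies whose result is overwritten before any read (sketch)."""
    kept = []
    for i, op in enumerate(plan):
        if op.kind == "copy":
            dead = False
            for later in plan[i + 1:]:
                if op.writes & later.reads:
                    break                 # a later op reads the copy: keep it
                if op.writes <= later.writes:
                    dead = True           # fully overwritten first: drop it
                    break
            if dead:
                continue
        kept.append(op)
    return kept

plan = [
    Op("copy", {"a"}, {"tmp"}),      # dead: tmp is rewritten before any use
    Op("copy", {"b"}, {"tmp"}),
    Op("compute", {"tmp"}, {"out"}),
]
assert len(drop_dead_copies(plan)) == 2
```

Each dropped copy is one less DMA transfer consuming bandwidth and power, which is exactly the kind of saving that matters for on-device workloads.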

Q. You spent a lot of time earlier in your career working on graphics systems. How does that background shape what you're doing in AI today?

A: Debugging graphics drivers taught me the value of approaching large, multi-layered systems systematically. Graphics pipelines share a great deal with AI runtimes - dataflow, memory hierarchy, and latency bottlenecks. That gave me a solid foundation in system-level reasoning that now serves me well in compiler and runtime optimization.

Q. Your research portfolio is considerable - how does it feed into your engineering work?

A: Research keeps me current with emerging ideas and shifting trends. I have authored numerous technical articles and serve as a peer reviewer for top journals and conferences, including Web of Science indexed journals. Serving as a guest editor of a special issue on machine learning in healthcare and as an invited speaker at technical conferences allows me to contribute to the broader conversation in the discipline. To me, research and engineering are two sides of the same coin - each informs the other, and practical solutions stem from sound knowledge.

Q. Your research contributions span multiple domains. Can you share what areas you focus on and why they matter?

A: My research focuses mainly on advanced AI designs, intelligent automation architectures, and large software systems for practical use. That spans neural network optimization, federated learning, intelligent IoT systems, and privacy-preserving AI designs. I have examined how edge computing, big data analytics, and adaptive deep learning models can be integrated into real-world solutions - everything from intelligent infrastructure and autonomous systems to secure IoT ecosystems and high-performance computing platforms. Some of my research also investigates bio-inspired optimization methods for making AI more efficient, as well as AI frameworks for robotics, cloud environments, and industrial automation.

Surendar reflects on the broader vision for his research on AI: "AI's full potential comes from building systems that are adaptable, reliable, and scale-ready."

Q. Beyond specific applications, what is the broader impact of your work on the field of AI?

A: The essence of my work is making AI accessible and sustainable. By enabling AI models to run faster and more efficiently on more kinds of hardware, we reduce barriers for developers and organizations. A startup with a great idea doesn't need a huge data center to build a successful AI product. It also allows AI to run on smaller, lower-power devices at the edge, such as smart cameras or healthcare sensors, which is essential for privacy and immediate responses.

Ultimately, this work brings AI closer to people, in terms of both cost and capability, and enables it to address a wider range of real-world problems, from personalized medicine to climate simulation.

Q. What do you see as the next significant challenge for AI and intelligent systems?

A: I see a huge challenge in creating fully adaptive intelligent systems that can scale effortlessly across platforms, from edge devices to massive cloud environments. We're moving towards a future where hardware and software are co-designed for unprecedented performance and efficiency. Compilers will remain crucial in handling this complexity, but the future goes well beyond compilation.
We will see standalone AI ecosystems that integrate optimization, security, and real-time responsiveness at every level, from power-efficient devices in smart infrastructure to high-performance computing for scientific simulations.

Sustainability is the other prominent area. With AI models growing in size and complexity, it is crucial to reduce energy consumption without compromising performance. My aspiration is to build technologies that make AI faster, scalable, responsible, and sustainable.

Q. What drives you to keep solving these complex challenges?

A: Honestly, it’s about making technology disappear in the best way. I love when something works so smoothly that people don’t even notice it’s there. That’s the sweet spot. And when that happens, you know the engineering is solid. That’s what keeps me coming back every single day.

Closing Statement:
From enterprise apps to graphics pipelines and now AI compilers, Surendar’s journey has one theme: versatility with depth. His work keeps redefining the software backbone of intelligent computing, one optimization at a time.
