I work on AI/ML at AMD. Previously, I was with Qualcomm AI Research, and I worked in several interesting and interrelated areas. My primary goal is to make computer vision and machine learning workloads efficient, either by designing hardware accelerators or by making the models themselves more efficient. I also teach at the University of California, San Diego in the Master of Advanced Studies program. You can find more about the Master of Advanced Studies program here: https://jacobsschool.ucsd.edu/mas/faculty-staff

Currently, I work on the following areas:

Full-stack machine learning: HW-SW co-design for low-power machine learning (full-stack optimization of ML from the model level to the hardware level: quantization, model-level optimization, graph-level optimization, and hardware-level optimization)
Efficient large language models (mostly focused on state space models and hybrid self-attention)
Emerging AI workloads (LLMs, 3D vision, 3D sparse convolution) on edge devices.
Extensive collaboration with universities (UCSD Visual Computing Center, Cornell, and many others). You can find some of the projects resulting from these collaborations here.

In the past, I worked on ML compilers (mainly TVM-based), and I also briefly worked on MLIR.

Previous work

Previously, I was a Principal Software Engineer in the 3D Engineering department of Cognex Corporation, where I worked on the following areas:

FPGA acceleration of computer vision algorithms
End-to-end design of a binary neural network on an FPGA

I contributed to Cognex's high-speed DSMAX product line, which can be found here: DSMAX 3D LASER DISPLACEMENT SENSOR

Parallel Programming for FPGAs Book

Parallel Programming for FPGAs, to which I contributed (I developed most of the labs and demos), is used by many universities around the world, including UCSD, Cornell, UT Austin, UC Berkeley, and many others.
The book has a nice tutorial/demo page here: Parallel Programming for FPGAs

Academic life: I completed my PhD in Computer Science in the Computer Science and Engineering department of UCSD. I was a Master's student at ICU (Information and Communications University), which is now part of the Korea Advanced Institute of Science and Technology. I obtained my Bachelor of Science in Computer Science from the Mongolian University of Science and Technology. In addition to my current job, I occasionally teach in the Computer Science and Engineering department of the University of California San Diego as a lecturer and an adjunct professor. A long time ago, I lived in South Korea, where I worked for ETRI as a researcher for a couple of years.

My general interests are applications of hardware acceleration to ML (LLMs, Transformers, 3D, deep learning), tinyML, ML-based ISP, computer vision, and deep learning compilers.
I am also interested in computer architecture, parallel programming, and ML compilers.