Mahmoud Khairy Abdallah
Purdue University
I am currently a member of the research staff at AMD Research in Santa Clara. I graduated with my PhD from the Department of Computer Engineering, Purdue University, where I was a research assistant in the
Accelerator Architecture Lab at Purdue (AALP), advised by Professor Tim Rogers. I am a big fan of programmable accelerators, especially SIMT-based accelerators and systems, like GPUs and RPUs. If you do not know what RPUs are, see my recent MICRO'22 paper!
My current research focus is to overcome the slowdown of Moore's law by building scalable and energy-efficient hardware, compilers, and systems for exascale computing, data center, and deep learning applications.
I received my BSc and MSc in Computer Engineering from Cairo University, Egypt.
Quick Links: Google Scholar - Linkedin - GitHub - Medium - CV
Recent News
- July 2023: I am honored that my Accel-Sim paper was selected for the prestigious retrospective of ISCA papers from the 25 years between 1996 and 2020; only 98 out of 1,077 papers were selected! Check out Purdue University's acknowledgement article.
- August 2022: I am delighted to join the AMD Research team in Santa Clara!
- July 2022: My SIMR paper is accepted at MICRO 2022.
- June 2022: I passed my PhD final exam successfully! My dissertation can be found here. Slides can be found here.
- Feb-April 2022: Invited Talks at UCF, University of Rochester, AMD Research and Intel AI.
- Sept 2021: Our recent SigArch article has been recognized in the Press. See HPCWire.
- Aug 2021: SigArch blog is released to understand the ML accelerator war (an academic's view).
- July 2021: Two papers accepted at MICRO 2021.
- October 2020: I passed my PhD preliminary exam successfully!
- August 2020: I recently became interested in writing technical blogs. My first article on Medium is about ML hardware comparison.
- July 2020: A paper accepted at MICRO 2020.
- June 2020: Accel-Sim is released.
- March 2020: A paper accepted at ISCA 2020.
Research Interests
- Data Center Microservices & Systems: SIMR[MICRO'22]
- Multi-GPU Scaling & Compilers: LADM[MICRO'20]
- SIMT Architecture & Caches: JPDC[2019], SIMR[MICRO'22], SIMTec[ISPASS2022], GPGPU2015, TPDS2016
- Scalable and Accurate Performance/Power Modeling: Accel-Sim[ISCA'20], AccelWattch[MICRO'21], SIGMETRICS[2018], SST_GPU
- Deep Learning Acceleration, Evaluation & Performance Analysis: Medium[2020], SigArch[2021], PKA[MICRO'21], and more to come
Publications
1st Author Conference Publications
(MICRO 2022) Mahmoud Khairy, Ahmad Alawneh, Aaron Barnes, and Timothy G. Rogers
SIMR: Single Instruction Multiple Request Processing for Energy-Efficient Data Center Microservices,
In The 55th IEEE/ACM International Symposium on Microarchitecture
(Acceptance rate: 86/348 = 22%)
[Paper] [Extended Paper] [Short Slides] [Full Slides] [FAQs] *Pending Patent*
(MICRO 2020) Mahmoud Khairy, Vadim Nikiforov, David Nellans, and Timothy G. Rogers
Locality-Centric Data and Threadblock Management for Massive GPUs,
In The 53rd IEEE/ACM International Symposium on Microarchitecture, Virtual Event, October 2020
(Acceptance rate: 82/422 = 19%)
[Paper] [Light Slides] [Light Talk] [Full Slides] [Full Talk]
(ISCA 2020) Mahmoud Khairy, Jason Shen, Tor M. Aamodt, and Timothy G. Rogers
Accel-Sim: An Extensible Simulation Framework for Validated GPU Modeling,
In The 47th International Symposium on Computer Architecture, Virtual Event, May 2020
(Acceptance rate: 77/421 = 18%)
[Paper] [Slides] [Video] [Release] [Retrospective]
Selected In ISCA@50 Retrospective: 1996-2020, only 98 out of 1,077 papers were selected.
2nd Author Conference Publications
(MICRO 2021) Cesar Avalos, Mahmoud Khairy, Roland N. Green, Mathias Payer, Timothy G. Rogers.
Principal Kernel Analysis: A Tractable Methodology to Simulate Scaled GPU Workloads,
In The 54th IEEE/ACM International Symposium on Microarchitecture
(Acceptance rate: 94/430 = 22%)
[Paper] [Full Slides] [Release]
(MICRO 2021) Vijay Kandiah, Scott Peverelle, Mahmoud Khairy, Amogh Manjunath, Junrui Pan, Timothy G. Rogers, Tor Aamodt, Nikos Hardavellas
AccelWattch: A Power Modeling Framework for Modern GPUs,
In The 54th IEEE/ACM International Symposium on Microarchitecture
(Acceptance rate: 94/430 = 22%)
[Paper] [Release]
(SIGMETRICS 2018) Jain Akshay*, Mahmoud Khairy*, Timothy G. Rogers (*co-first authors)
A Quantitative Evaluation of Contemporary GPU Simulation Methodology,
In The ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, Irvine, California, June 2018
[Paper] [Poster] [Slides]
Journals
(JPDC 2019) Mahmoud Khairy, Amr G. Wassal, and Mohamed Zahran
A Survey of Architectural Approaches for Improving GPGPU Performance, Programmability and Heterogeneity,
In The Elsevier Journal of Parallel and Distributed Computing 127 (2019): 65-88, February 2019
[Paper]
(TPDS 2016) Mahmoud Khairy, Mohamed Zahran, and Amr G. Wassal,
SACAT: Streaming-Aware Conflict-Avoiding Thrashing-Resistant GPGPU Cache Management Scheme,
In The IEEE Transactions on Parallel and Distributed Systems
[Paper]
Workshops/Posters
(ISPASS 2022 Poster) Ahmad Alawneh, Mahmoud Khairy and Timothy G. Rogers
A SIMT Analyzer for Multi-Threaded CPU Applications,
In The 2022 IEEE International Symposium on Performance Analysis of Systems and Software, Singapore, May 2022
[Paper] [Poster]
(ISPASS 2019 Poster) Mahmoud Khairy, Jain Akshay, Tor M. Aamodt, and Timothy G. Rogers
A Detailed Model for Contemporary GPU Memory Systems,
In The 2019 IEEE International Symposium on Performance Analysis of Systems and Software, Wisconsin, March 2019
[Paper] [Poster]
(GPGPU 2015) Mahmoud Khairy, Mohamed Zahran, and Amr G. Wassal,
Efficient Utilization of GPGPU Cache Hierarchy,
In The 8th Workshop on General Purpose Computing using GPUs (GPGPU8), co-located with the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP), San Francisco, California, 2015
[Paper] [Slides]
Technical Reports
(SNL TR) Mahmoud Khairy, Mengchi Zhang, Roland Green, Simon David Hammond, Robert J. Hoekstra, Timothy Rogers, and Clayton Hughes
SST_GPU: An Execution-Driven CUDA Kernel Scheduler and Streaming-Multiprocessor Compute Model,
SANDIA REPORT SAND2019-1967, February 2019
[Paper] [Release]
(SNL TR) Hughes, Clayton, Simon David Hammond, Mahmoud Khairy, Mengchi Zhang, Roland Green, Timothy Rogers, and Robert J. Hoekstra
Balar: A SST GPU Component for Performance Modeling and Profiling,
SANDIA REPORT SAND2019-10389, Sandia National Lab (SNL-NM), 2019
[Paper] [Release]
(arXiv) Mahmoud Khairy, Jain Akshay, Tor Aamodt, Timothy G. Rogers
Exploring Modern GPU Memory System Design Challenges through Accurate Modeling,
arXiv report 1810.07269, October 2018
[Paper]
Dissertations
(PhD Dissertation) Mahmoud Khairy,
Scalable and Energy-Efficient SIMT Systems for Deep Learning and Data Center Microservices, August 2022
[Thesis] [Slides]
(MSc Dissertation) Mahmoud Khairy
Efficient Utilization of GPGPU Cache Hierarchy, May 2015
Technical Blogs
(SigArch) Tim Rogers and Mahmoud Khairy
An Academic’s Attempt to Clear the Fog of the Machine Learning Accelerator War,
SigArch blog, Aug 2021
[Blog link]
(Medium) Mahmoud Khairy
TPU vs GPU vs Cerebras vs Graphcore: A Fair Comparison between ML Hardware,
Medium article, July 2020
[Article link] [Slides]
Press
HPCWire
‘Purdue Researchers Peer into the "Fog of the Machine Learning Accelerator War"’
[Article link]
ZDnet
‘AI computer maker Cerebras nabs TotalEnergies SE as first energy sector customer’
[Article link]
Analytics India Magazine
‘Thinking Beyond Generative AI, One Token At A Time’
[Article link]
Community Services
- ISPASS 2023 Program Committee Member and Poster Chair
- Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD) Journal - External Reviewer - 2023
- Computer Architecture Letters (CAL) - External Reviewer - 2023
- Transactions on Architecture and Code Optimization (TACO) Journal - External Reviewer - 2023
- Transactions on Computers (TC) Journal - External Reviewer - 2022
Software
- Accel-Sim (Principal Contributor)
- SST_GPU (Principal Contributor)
- AccelWattch
- Principal Kernel Analysis for MLPerf workloads
- SIMT Analysis Tool for CPU workloads, SIMTec (To be released soon)
My CV
- Mahmoud_Khairy_CV_2022, Last Update: Dec 2022
- Research and Teaching Statements (available upon request)
Contact
- Email:
abdallm(AT)purdue.edu
khairy2011(AT)gmail.com