Shared Memory Bank Conflict Counter

What this preview is

Shared Memory Bank Conflict Counter is a medium-difficulty quant coding problem on computer architecture, solved in Python and asked at NVIDIA.

Understanding GPU shared memory bank conflicts in NVIDIA architectures

This is a medium-difficulty coding problem that tests your understanding of how NVIDIA GPUs manage on-chip shared memory and the performance penalties that arise from hardware contention. It appears frequently in GPU architecture and systems interviews at companies like NVIDIA.

The core challenge is to model the bank conflict resolution logic correctly. Shared memory is partitioned into 32 independent banks, and when a warp (a group of 32 threads that execute together) issues a memory operation, each thread accesses one address. The hardware can service one access per bank per cycle, so if multiple threads target different addresses in the same bank, those accesses must be serialized. The key insight is that a broadcast, where multiple threads read the same address, does not create a conflict. Your solution must group accesses by bank, count the distinct addresses within each bank (treating a broadcast as a single access), and take the maximum serialization depth across all banks.
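
The grouping and deduplication steps described above can be sketched as follows. This is a minimal illustration, not the reference solution: the function name `serialization_depth`, the use of byte addresses, and the 4-byte bank width (so that consecutive 32-bit words map to consecutive banks) are assumptions, since the preview does not state the exact input format.

```python
from collections import defaultdict

NUM_BANKS = 32   # number of banks, as stated in the problem description
BANK_WIDTH = 4   # assumption: each bank is one 4-byte word wide


def serialization_depth(byte_addresses):
    """Return how many passes one warp-wide access needs (hypothetical helper).

    byte_addresses holds one byte address per active thread. A result of 1
    means the access is conflict-free; 0 means no thread accessed memory.
    """
    words_per_bank = defaultdict(set)
    for addr in byte_addresses:
        word = addr // BANK_WIDTH   # index of the 4-byte word being touched
        bank = word % NUM_BANKS     # consecutive words map to consecutive banks
        # A set deduplicates threads that hit the same word (a broadcast).
        words_per_bank[bank].add(word)
    # The busiest bank determines how many times the access is replayed.
    return max((len(words) for words in words_per_bank.values()), default=0)


# Each thread reads its own consecutive word: every bank is used once.
print(serialization_depth([4 * t for t in range(32)]))    # 1
# Every thread reads the same word: a broadcast, still one pass.
print(serialization_depth([0] * 32))                      # 1
# A stride of 32 words puts all 32 threads in bank 0.
print(serialization_depth([128 * t for t in range(32)]))  # 32
```

If the actual problem supplies word indices rather than byte addresses, the division by `BANK_WIDTH` would simply be dropped.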

  • Modular arithmetic and address-to-bank mapping
  • Grouping and deduplication logic
  • Identifying the bottleneck (maximum contention across all banks)
  • Accumulating penalties across multiple memory operations
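
The last two items, finding the bottleneck bank and accumulating penalties over several memory operations, might look like the sketch below. The penalty definition is an assumption: each operation is charged its serialization depth minus one, meaning the extra passes beyond a conflict-free access. The preview does not say whether the expected answer counts extra passes or total passes, and the names `depth` and `total_conflict_penalty` are hypothetical.

```python
NUM_BANKS = 32   # number of banks, as stated in the problem description
BANK_WIDTH = 4   # assumption: 4-byte banks and byte-addressed inputs


def depth(byte_addresses):
    """Maximum number of distinct words requested from any single bank."""
    seen = {}  # bank index -> set of distinct word indices requested
    for addr in byte_addresses:
        word = addr // BANK_WIDTH
        seen.setdefault(word % NUM_BANKS, set()).add(word)
    return max((len(words) for words in seen.values()), default=0)


def total_conflict_penalty(operations):
    """Sum the extra passes over a sequence of warp-wide memory operations.

    operations is a list in which each element is the list of byte
    addresses accessed by the threads of one warp in one operation.
    """
    total = 0
    for addresses in operations:
        # Assumption: a conflict-free access (depth 1) costs no penalty.
        total += max(depth(addresses) - 1, 0)
    return total


ops = [
    [4 * t for t in range(32)],    # unit stride: depth 1, penalty 0
    [8 * t for t in range(32)],    # stride of 2 words: depth 2, penalty 1
    [128 * t for t in range(32)],  # stride of 32 words: depth 32, penalty 31
    [0] * 32,                      # broadcast: depth 1, penalty 0
]
print(total_conflict_penalty(ops))  # 32
```

If the problem instead asks for total cycles, replacing `max(depth(addresses) - 1, 0)` with `depth(addresses)` gives that variant.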