GPU Kernel Tile Scheduler

GPU tile scheduling under dependency constraints and resource limits

This hard coding problem tests your ability to model and solve a scheduling problem on a constrained resource (GPU streaming multiprocessors). It is the kind of system-level question that Nvidia and other GPU vendors pose to candidates working on compiler, runtime, or kernel-dispatch optimization.

The core challenge is to assign computational tiles to a fixed pool of parallel processors while respecting a directed acyclic graph of data dependencies and minimizing total wall-clock time. Unlike a simple load-balancing problem, you must account for blocking: tiles cannot start until their dependencies finish, which creates critical-path constraints that no amount of parallelism can overcome.

Strong solutions typically combine a topological simulation (processing tiles in dependency order) with a greedy or priority-based scheduling strategy. You will need to track when each SM becomes free, respect when each tile becomes ready, and decide which ready tile to assign next—often by reasoning about downstream impact or remaining workload. The problem rewards both correctness on the given examples and efficiency for large n.

Topological sort and DAG traversal
Greedy scheduling and priority queues
Critical-path analysis in task graphs
Discrete-event simulation

About this preview

Unlock full access to getcracked

GPU tile scheduling under dependency constraints and resource limits

Firms that ask questions like this

What this preview is

About this preview

Unlock full access to getcracked

GPU tile scheduling under dependency constraints and resource limits

Firms that ask questions like this

Related practice