Logo

Coding preview

Thread-Safe Streaming Percentile Engine

What this preview is

About this preview

Thread-Safe Streaming Percentile Engine is a cracked quant coding problem on concurrency in Python, asked at Quant.

Unlock full access to getcracked

Join to unlock this problem, detailed solutions, and our complete library of quant finance interview prep.

Building a thread-safe streaming percentile engine for high-frequency trading

This is a hard coding problem that combines two distinct challenges: implementing a memory-efficient approximate quantile sketch, then making it safe for concurrent producer-consumer access under extreme scale. It is representative of the systems-level work that quant trading desks demand, where both algorithmic correctness and concurrency discipline matter equally.

Part 1 tests your ability to trade off accuracy for space: you must implement a streaming percentile estimator (such as Greenwald-Khanna) that maintains a sub-linear summary of up to 108 observations while answering percentile queries in sub-linear time. Part 2 escalates this to a multi-threaded setting, requiring you to protect the sketch with a readers-writer lock that allows concurrent queries but serializes ingestion, all without races or deadlocks.

Strong solutions demonstrate fluency with the Greenwald-Khanna algorithm or an equivalent sketch, careful state management in a concurrent context, and clear reasoning about linearizability: a query must reflect some consistent snapshot of ingested values taken before it began. Interviewers probe whether you understand the difference between a global mutex and a readers-writer lock, and whether you can reason about memory and time complexity under contention.

  • Streaming quantile summaries and rank-error guarantees
  • Readers-writer locks and serialization vs. concurrency
  • Linearizability and snapshot consistency under concurrent updates
  • Space and time complexity trade-offs at scale