Google Software Engineer · Behavioral

Tell me about a time you had to make an important technical decision without complete information.

Senior and staff engineers at Google constantly make decisions with incomplete data: there's never enough time, never enough signal, and you're often working at a scale where small bets compound. This question probes your *process*, not whether you got the answer right. Interviewers want to see how you frame what you don't know, how you scope unknowns into something testable, and what you learned afterward. Pretending you had complete information and hand-waving past the unknowns both signal a junior mindset.

How to think about it

Walk through five beats: (1) the decision you faced and the deadline, (2) what you knew vs. what you didn't, (3) how you scoped the unknowns into something you could reason about, (4) what you decided and why, (5) what you learned post-hoc. Beat (3) is where the interview gets won.

Weak · sample answer

45/100

Yeah, this happens a lot actually. One time we had to pick between two databases for a new service.

No deadline or scale anchor
Without a deadline or scale, the interviewer can't tell whether this decision was a 30-minute coffee chat or a $10M architectural commitment.

We didn't have time to do a full benchmark, so I had to make a call based on what I knew.

I went with Postgres because I'd used it before and it seemed safer. The other option was a newer NoSQL thing that the team was excited about but we didn't really know how it would scale.

Decision driven by familiarity, not analysis
'I'd used it before' is a tiebreaker, not a reason. The interviewer wants to see you reason about the unknowns, not avoid them.

It turned out fine. Postgres handled our load without issues.

Why this scores weak

The answer names a real decision but never engages with what was unknown or how the candidate reasoned about it. The decision is justified by familiarity ('I'd used it before') rather than analysis, and the outcome is summarized as 'fine' without any data.

Key takeaways

  • Anchor the decision in a deadline ('we had to ship by X') and a scale ('Y QPS, Z TB').
  • Make the unknowns explicit — 'we didn't know how Cassandra would behave at our write pattern' beats 'we didn't have time to benchmark'.
  • Replace 'I'd used it before' with the actual property you were optimizing for (operational simplicity, team familiarity, recovery model).

Average · sample answer

72/100

Last year we were building a new ingestion pipeline for clickstream data — projected at around 50k events per second at launch with growth to maybe 200k within a year. We had about three weeks before a hard launch deadline tied to a partner integration.

The main decision was the queueing layer between our edge collectors and the warehouse loaders. We had two viable options: Kafka, which the rest of the org already ran, or a managed service we'd been evaluating that would save us 2-3 weeks of ops work but was billed on throughput.

Clear options with tradeoffs
Naming both options and the dimension they trade off on is the right first move.

We didn't have time to load-test either option at the projected 200k/sec. So I had to decide without knowing how either would behave at peak.

States the unknown but doesn't scope it
Good that you flagged the unknown. The next step — and the one this answer skips — is showing how you turned that unknown into something testable.

I went with Kafka. The reasoning was that the ops cost of debugging an unknown managed service at peak would be higher than the savings during the build phase, and the rest of the org's experience meant we'd have help if something went wrong.

We hit launch and the pipeline handled the load fine. About four months in we crossed 150k/sec and started seeing some rebalancing issues, which we'd anticipated.

Why this scores average

Strong situation setup and a defensible decision, but the answer skates past the most interesting part: how the candidate reasoned about the unknowns. The candidate identifies that they couldn't load-test at peak but doesn't show what they did instead (smaller-scale tests? back-of-envelope math? talking to teams already at that scale?).

Key takeaways

  • When you identify an unknown, show one or two ways you reduced it: smaller-scale benchmarks, conversations with teams already operating at that scale, capacity-planning math.
  • Surface the *tradeoff dimension* explicitly: were you optimizing for ops cost, team risk, vendor lock-in, recovery model?
  • End with what you learned — even a sentence ('in hindsight, the managed service might have been fine, but we got more out of the team familiarity than I expected') turns the answer from descriptive into reflective.
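
The capacity-planning math mentioned in the first takeaway can be as small as a few lines. The sketch below is illustrative only: the 500-byte average event size, the 5 MB/s per-partition rate, and the 2x headroom factor are assumptions invented for the example, not figures from the sample answer or from any Kafka documentation. Only the 50k and 200k events/sec rates come from the scenario.

```python
import math


def ingress_bytes_per_sec(events_per_sec: int, avg_event_bytes: int) -> int:
    """Raw write bandwidth the queueing layer must absorb."""
    return events_per_sec * avg_event_bytes


def partitions_needed(
    events_per_sec: int,
    avg_event_bytes: int = 500,                    # assumed average event size
    per_partition_bytes_per_sec: int = 5_000_000,  # assumed sustainable rate per partition
    headroom: float = 2.0,                         # assumed safety factor for bursts
) -> int:
    """Rough partition count: bandwidth with headroom divided by per-partition rate."""
    required = ingress_bytes_per_sec(events_per_sec, avg_event_bytes) * headroom
    return math.ceil(required / per_partition_bytes_per_sec)


if __name__ == "__main__":
    for rate in (50_000, 200_000):  # launch and one-year rates from the scenario
        mb_per_sec = ingress_bytes_per_sec(rate, 500) / 1_000_000
        print(f"{rate} events/sec ~ {mb_per_sec:.0f} MB/s, "
              f"{partitions_needed(rate)} partitions")
```

In an interview, the exact output matters less than showing that you put a rough number on the unknown and said which inputs you were guessing at.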

Strong · sample answer

89/100

Last year we were building a clickstream ingestion pipeline — 50k events/sec at launch, projected 200k within a year, with a hard launch deadline tied to a partner integration three weeks out.

Scale + deadline up front
Scale and deadline in a single sentence set the stakes.

The decision: queueing layer between edge collectors and warehouse loaders. Two viable options: Kafka, which the rest of the org runs, or a managed service that would save us 2-3 weeks of ops work but was priced on throughput. The unknown was peak behavior: we had no time to load-test either at 200k/sec, and neither option had a public benchmark at our exact write pattern.

Rather than guess, I scoped the unknown into three things I could actually answer in a week: (1) does Kafka rebalance cleanly at our partition count when we hit 100k/sec — testable at half-scale in our staging cluster; (2) does the managed service handle our specific write pattern — I emailed the vendor's SE and got a customer reference at 80k/sec with a similar pattern; (3) what's the total cost-of-ownership over 18 months at 200k/sec — back-of-envelope math, but enough to bracket it.

Scoping the unknown into testable pieces
This is the beat that wins the interview. You can't always reduce uncertainty to zero, but you can decompose it into pieces that *are* answerable in your timeframe.

What I learned from those three: Kafka rebalanced fine at half-scale, the managed service's reference customer was very happy but acknowledged occasional throttling at peak, and the 18-month TCO came out roughly equal — managed service slightly cheaper at low scale, Kafka slightly cheaper at high scale.

I made the call to go with Kafka, for reasons that weren't the ones I would've named at the start. The cost wasn't the deciding factor — it was the throttling note from the reference customer combined with our partner SLA. If our pipeline throttled during a launch event, we'd be paying for an incident we couldn't easily debug at the vendor layer. With Kafka, debugging would be painful but possible.

Decision pivots on the real tradeoff dimension
Notice how the decision is driven by debuggability + SLA risk, not cost or familiarity. That's the kind of nuance interviewers are listening for.

We launched on time and held 50k/sec without issues. At month four we hit 150k and started seeing the rebalancing edge cases we'd predicted at half-scale, which we'd already drafted a runbook for. The big thing I learned: I'd been about to pick Kafka for the wrong reason ('we already run it'), and forcing myself to decompose the unknowns made me realize the actual decision driver was debuggability during incidents. I've used that decomposition pattern on every major call since.

Reflective close changes the gear of the answer
Naming the wrong reason you almost picked, and the meta-lesson you took forward, signals seniority. This is the senior-engineer version of 'what I learned'.

Why this scores strong

The answer hits all five beats, and the third one (decomposing the unknown into testable pieces) does the heavy lifting. The decision pivots on a non-obvious dimension (debuggability under SLA, not cost or familiarity), and the reflective close names the wrong reason the candidate almost picked, which is what staff-level reflection sounds like.

Key takeaways

  • When facing a big unknown, decompose it into 2-4 things you *can* answer in your timeframe — staging tests at half-scale, vendor reference calls, back-of-envelope TCO.
  • Be willing to say the decision pivoted on a dimension you didn't expect (debuggability, vendor SLA, recovery model).
  • Reflective close should name the *wrong* reason you almost picked, not just what you got right.
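
A back-of-envelope TCO comparison like the one in the strong answer could be sketched as follows. Every price and capacity figure here is hypothetical, chosen only to reproduce the shape the answer describes (the throughput-priced service cheaper at low scale, self-run Kafka cheaper at high scale). The sketch also ignores the one-time build cost, which a fuller estimate would include.

```python
import math

SECONDS_PER_MONTH = 30 * 24 * 3600  # 30-day month, close enough for a bracket


def managed_monthly_cost(events_per_sec: int,
                         usd_per_million_events: float = 0.05) -> float:
    """Throughput-priced service: cost grows linearly with event volume.

    The per-million price is an assumed figure, not a real vendor rate.
    """
    millions_per_month = events_per_sec * SECONDS_PER_MONTH / 1_000_000
    return millions_per_month * usd_per_million_events


def kafka_monthly_cost(
    events_per_sec: int,
    fixed_ops_usd: float = 8_000.0,           # assumed on-call and maintenance share
    usd_per_broker: float = 1_000.0,          # assumed monthly cost per broker
    events_per_sec_per_broker: int = 25_000,  # assumed broker capacity
    min_brokers: int = 3,                     # assumed minimum cluster size
) -> float:
    """Self-run cluster: a fixed ops cost plus a step function of broker count."""
    brokers = max(min_brokers,
                  math.ceil(events_per_sec / events_per_sec_per_broker))
    return fixed_ops_usd + brokers * usd_per_broker


def cheaper_option(events_per_sec: int) -> str:
    """Return which option is cheaper per month under the assumed prices."""
    managed = managed_monthly_cost(events_per_sec)
    kafka = kafka_monthly_cost(events_per_sec)
    return "managed" if managed < kafka else "kafka"


if __name__ == "__main__":
    for rate in (50_000, 200_000):
        print(rate, cheaper_option(rate))
```

The useful output is the crossover, not the dollar figures: if the two options trade places somewhere inside your projected growth range, cost alone probably won't settle the decision, and you can say so.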