Practice real data science interviews, tuned by the interviewers who actually run them.

Real case study interviews from

Meta
Google
Uber
Spotify
TikTok
DoorDash
Netflix
81 cases

Diagnosing Metric Changes

Investigate drops, decompose metrics, find root causes

Meta · Product Sense · Medium · ~20 min
News Feed Ranking Decline

Facebook's News Feed team shipped a ranking algorithm change last week. Since then, comments per user are down 12% but time spent is up 5%. The VP wants to know: is this a problem?

Metric decomposition · Tradeoff reasoning · Root cause analysis
Start session →
Meta · Product Sense · Hard · ~25 min
Facebook Likes Removal

Facebook is considering removing visible like counts from posts. Instagram already tested hiding likes in several markets. You need to think through how to measure the full impact of this change across users, creators, and the platform ecosystem.

Impact measurement · Segment thinking · Behavioral analysis · Ecosystem reasoning
Start session →
Meta · Product Sense · Hard · ~25 min
Friend Content in News Feed

The Facebook News Feed team is trying to understand whether users see enough friend content in their feed. The balance between friend posts, public content, and recommended content is critical to Facebook's value proposition. You need to figure out how to measure and optimize this.

Metric definition · Content balance optimization · Segment analysis · Experiment design
Start session →
Meta · Product Sense · Hard · ~25 min
Average Reels Watched Drop

The Reels team has flagged an alarming trend: 'average reels watched per session' has dropped precipitously over the past week. The VP wants answers by end of day. You need to diagnose what's happening and whether it's actually a problem.

Root cause analysis · Metric decomposition · Data quality awareness · Urgency management
Start session →
Amazon · Product Sense · Hard · ~30 min
Returns Investigation

Amazon's return rate jumped 15% week-over-week. The VP of Customer Experience wants answers by end of day. Walk me through your investigation.

Metric investigation · Customer thinking · Segmentation · Root cause analysis
Start session →
Google · Product Sense · Hard · ~30 min
YouTube Watch Time Drop

YouTube's total watch time dropped 8% week-over-week. The VP of Product wants a root cause analysis before the quarterly business review. Multiple platform, regional, and content-type dimensions may be driving the decline, and early data suggests the drop is concentrated on mobile.

Metric decomposition across multiple dimensions · Platform-specific hypothesis generation · Distinguishing product vs. external causes · Prioritized investigation framework
Start session →
Google · Product Sense · Hard · ~30 min
Google Play Privacy vs Engagement

Google Play's privacy perception score improved 8% after launching prominent privacy labels, but Play Store engagement (app installs, browse time, return visits) dropped 14% in the same period. The VP of Trust & Safety wants to understand whether the privacy initiative is hurting engagement or if this is coincidental.

Disentangling correlated metrics · Causal vs. coincidental analysis · User segment-level diagnosis · Privacy-engagement tradeoff reasoning
Start session →
Google · Product Sense · Hard · ~30 min
Google Search Simpson's Paradox

Average searches per user declined 4% globally, but average searches per user increased in every individual country. The CEO asks why the global number is going down if every country is improving. You need to diagnose and explain this paradox clearly enough for an executive audience.

Simpson's Paradox recognition · Compositional shift analysis · Executive-level metric communication · Growth-stage metric framing
Start session →
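
The paradox in the Google Search case above can be reproduced with a small mix-shift calculation. The sketch below uses made-up figures for two hypothetical country segments (the names, user counts, and rates are assumptions, not data from the case) and splits the change in the global average into a rate effect and a mix effect.

```python
# Hypothetical figures for illustration only: (users in millions, searches per user).
before = {"mature": (80, 10.0), "emerging": (20, 3.0)}
after = {"mature": (80, 10.2), "emerging": (31, 3.2)}


def global_average(segments):
    """User-weighted average of searches per user across all segments."""
    total_users = sum(users for users, _ in segments.values())
    total_searches = sum(users * rate for users, rate in segments.values())
    return total_searches / total_users


def mix_shift_decomposition(before, after):
    """Split the change in the global average into rate and mix effects."""
    users_before = sum(users for users, _ in before.values())
    users_after = sum(users for users, _ in after.values())
    rate_effect = 0.0
    mix_effect = 0.0
    for name in before:
        u0, r0 = before[name]
        u1, r1 = after[name]
        w0, w1 = u0 / users_before, u1 / users_after
        rate_effect += w1 * (r1 - r0)  # per-segment rates changed, new mix held fixed
        mix_effect += (w1 - w0) * r0   # user mix changed, old rates held fixed
    return rate_effect, mix_effect


rate_effect, mix_effect = mix_shift_decomposition(before, after)
print(f"global before: {global_average(before):.2f}")
print(f"global after:  {global_average(after):.2f}")
print(f"rate effect: {rate_effect:+.2f}, mix effect: {mix_effect:+.2f}")
```

In this toy setup every segment's rate rises, so the rate effect is positive, while faster growth in the low-usage segment produces a larger negative mix effect that pulls the global average down.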
Google · Product Sense · Medium · ~25 min
Google Maps DAU Drop

Google Maps DAU dropped 10% over the past 2 weeks. The VP of Geo wants an explanation before the exec review. The drop could stem from a recent iOS update, a product change, an external factor, or a data quality issue. You need to systematically rule out hypotheses.

Systematic hypothesis elimination · Platform-specific diagnosis · Data quality vs. product vs. external cause distinction · Structured communication under time pressure
Start session →
Uber · Product Sense · Hard · ~30 min
Rider Cancellation Spike Investigation

Rider cancellation rate in San Francisco jumped from 8% to 14% over the last week. Your team flagged this. Walk me through your investigation.

Two-sided marketplace reasoning · Supply-demand decomposition · Root cause analysis · Feedback loop identification
Start session →
TikTok · Product Sense · Hard · ~30 min
Algorithm Change — Creator Impact

TikTok's recommendation team optimized for watch-through rate instead of engagement rate. Watch time is up 5%, but creator follower growth is down 8%. The creator partnerships team is concerned. Investigate.

Root cause analysis · Ecosystem tradeoff · Experiment design · Creator economics
Start session →
Coinbase · Product Sense · Hard · ~30 min
Trading Volume Decline Investigation

Coinbase's total trading volume dropped 30% month-over-month. The CEO wants to understand if this is a market problem or a product problem. Walk through your investigation.

Metric investigation · Market vs. product decomposition · Segmentation · Root cause analysis
Start session →
DoorDash · Product Sense · Hard · ~30 min
Cold Food Crisis

Cold food complaints have spiked 34% over the past 3 weeks across 12 metro areas. The VP of Customer Experience wants a root cause analysis. Two recent operational changes — batched delivery and prep time inflation — may be interacting in unexpected ways.

Root cause analysis with interaction effects · Multiplicative vs. additive decomposition · Timeline-to-product-change mapping · Business trade-off reasoning
Start session →
DoorDash · Product Sense · Medium · ~20 min
Delivery Time Investigation

Average delivery time increased by 4.2 minutes last month across the platform. No major product changes were shipped. The operations team is alarmed and wants to know what's driving the increase and whether it requires immediate action.

Metric decomposition · Prioritized investigation · Supply/demand diagnosis
Start session →
DoorDash · Product Sense · Medium · ~20 min
Regional Cancellation Spike

Consumer order cancellations are up 28% in specific geographic markets over the past two weeks. The pattern is concentrated in suburban areas during dinner rush hours. You need to diagnose the root cause and recommend immediate actions.

Root cause analysis · Supply vs. demand diagnosis · Geographic analysis
Start session →
Spotify · Product Sense · Hard · ~30 min
Spotify Installs Dropped 25%

Spotify's app installs dropped 25% month-over-month across all platforms. The CMO flagged this in the weekly business review and wants a root cause analysis within 48 hours. Multiple factors could be at play — marketing spend changes, competitive moves, platform-level shifts, and seasonality. You need to structure the investigation, identify the most likely causes, and size their impact.

Multi-dimensional metric diagnosis · Segmentation by platform, geography, and channel · Competitive impact sizing · Separating internal vs. external causes
Start session →
Spotify · Product Sense · Hard · ~35 min
Anomalous Account Behavior Detection

Spotify's Trust & Safety team has flagged a spike in anomalous account behavior — accounts with listening patterns consistent with credential sharing, bot-driven stream farming, or compromised accounts. These accounts inflate streaming metrics, distort royalty payouts, and degrade recommendation quality. You need to design signals for detecting different types of anomalous behavior, set appropriate thresholds, reason about false positive tolerance, and define success metrics for a fraud detection system.

Signal design for anomaly detection · Threshold setting and false positive tolerance reasoning · Multi-type fraud taxonomy · Measuring success of a detection system
Start session →
Netflix · Product Sense · Hard · ~30 min
Engagement Metric Dropped 20%

Netflix's primary engagement metric has dropped 20% week-over-week. Leadership is alarmed and wants a root cause analysis within 48 hours. The drop could be a data issue, a seasonal effect, a product change, a content gap, or a genuine shift in user behavior. You need to structure a systematic investigation and determine what's real vs. artifactual.

Systematic root cause analysis for metric drops · Ruling out technical and methodological artifacts · Segmenting a metric drop to isolate the cause · Separating correlation from causation in metric diagnosis
Start session →

Measuring Feature Success

Define metrics, track impact, identify issues

Meta · Product Sense · Hard · ~30 min
Facebook Events Notification

The PM responsible for Facebook Events has a new idea to drive engagement. When your friends mark that they'll attend an event, you'll receive a notification. How would you measure the success of this notification?

Metric definition · Guardrail thinking · Diagnosis · Experiment design
Start session →
Meta · Product Sense · Medium · ~20 min
Marketplace Listing Notification

Facebook is creating a notification that alerts users when their Marketplace listing is about to expire. Your PM wants to measure the quality and effectiveness of this notification before a broader rollout.

Metric definition · Notification quality measurement · Tradeoff reasoning
Start session →
Meta · Product Sense · Medium · ~20 min
Facebook Restaurants Feature

Facebook is launching a new product called Facebook Restaurants — a Marketplace-like experience for discovering, reviewing, and ordering from local restaurants. You need to define the target users, success metrics, and measurement plan.

Product definition · North Star metric identification · Cannibalization analysis
Start session →
Meta · Product Sense · Medium · ~20 min
Group Chat Feature Launch

Facebook is launching a group chat feature similar to Discord channels within Facebook Groups. The goal is to increase real-time engagement within communities. You need to define success and guardrail metrics for the launch.

Launch measurement · Guardrail definition · Rollout planning
Start session →
Meta · Product Sense · Hard · ~25 min
Feature Impact on User Happiness

Meta's leadership has been pushing teams to optimize for user wellbeing, not just engagement. You've been asked to develop a framework for measuring whether a feature actually makes users happier — a much harder question than whether it drives engagement.

Measurement methodology · Proxy metrics · Mixed methods · Distinguishing engagement from wellbeing
Start session →
Meta · Product Sense · Hard · ~25 min
Instagram Reels Success Metrics

Instagram's Reels product has been live for over a year. Leadership wants a comprehensive health check: is Reels actually succeeding, or is it just cannibalizing other Instagram surfaces? You need to define the metric framework for evaluating Reels as a product line.

Metric framework design · Incrementality measurement · Creator ecosystem · Competitive analysis
Start session →
Amazon · Product Sense · Hard · ~30 min
Prime Exclusive Discounts

Amazon is launching 10% exclusive discounts for Prime members on select products. The goal is to increase Prime conversion and retention. How would you measure success?

Metric definition · Prime economics · Tradeoff reasoning · Cannibalization awareness
Start session →
Google · Product Sense · Medium · ~25 min
Google Docs Metrics From Scratch

Google Docs has never had a formal metrics framework. The new PM is tasked with designing one from scratch. The challenge: DAU is growing 6% QoQ, but docs created per user is declining 8%. Collaboration features are underutilized, and it's unclear whether Docs is winning on quality or just inertia. Design the top 5 metrics and identify the North Star.

Metric framework design from first principles · North Star metric selection and justification · Collaboration quality measurement · Leading vs. lagging indicator reasoning
Start session →
Uber · Product Sense · Hard · ~30 min
Uber One Subscription Measurement

Uber One is a $9.99/month subscription that bundles discounts on Mobility rides with waived delivery fees on Delivery orders. How would you measure its success?

Multi-product metric definition · Cannibalization analysis · LTV reasoning · Segmentation strategy
Start session →
TikTok · Product Sense · Hard · ~30 min
For You Feed Diversification

TikTok's recommendation team wants to show more content from emerging creators in the For You feed. Currently, the top 5% of creators generate 60% of all views. The proposal: allocate 20% of feed slots to creators with fewer than 10K followers. How would you measure success?

Ecosystem thinking · Metric definition · Tradeoff reasoning · Experiment design
Start session →
Coinbase · Product Sense · Hard · ~30 min
Coinbase Earn Feature

Coinbase launched the Earn feature — users watch short educational videos about crypto projects and earn small amounts of that cryptocurrency. The feature is live and the PM wants to decide whether to expand it. How would you measure success, and what does the data tell you?

Metric definition · Data interpretation · Unit economics · Engagement vs. intent
Start session →
Reddit · Product Sense · Hard · ~30 min
Measuring Subreddit Health

Reddit leadership wants a single 'Subreddit Health Score' metric displayed on an internal dashboard. You're tasked with defining it. The score should work across r/science (2M members, strict moderation) and r/memes (15M members, loose moderation).

Metric design · Community heterogeneity · Stakeholder pushback · Practical constraints
Start session →
DoorDash · Product Sense · Hard · ~30 min
Ad Effectiveness Audit

DoorDash's Sponsored Listings product claims merchants see a 2.3x order lift, but the Growth team suspects selection bias is inflating the number. You need to audit true ad incrementality and determine whether the $18M/month ad product is delivering real value.

Selection bias identification · Causal inference methods (PSM, RDD, ghost ads) · Incrementality measurement · Segmented analysis by merchant tier
Start session →
DoorDash · Product Sense · Medium · ~25 min
Store Search Launch

DoorDash launched Store Search — cross-restaurant item search (e.g., 'pad thai' returns results from multiple restaurants). Engagement looks great: 38% adoption, 15% of orders from search. But the VP of Product challenges whether it's creating new demand or just redistributing existing orders.

Incremental vs. cannibalized demand analysis · Feature measurement with cannibalization · Merchant ecosystem health reasoning · Decision fatigue and UX trade-offs
Start session →
DoorDash · Product Sense · Medium · ~20 min
Marketplace Demand Metrics

You've been asked to define the core metrics framework for understanding marketplace demand at DoorDash. This goes beyond simple order counts — you need metrics that capture demand health, unmet demand, and early warning signals for demand-supply imbalances.

Metric framework design · Marketplace dynamics · Demand forecasting
Start session →
DoorDash · Product Sense · Hard · ~25 min
DoorDash Key Metrics Framework

Leadership wants a single health scorecard that captures the entire DoorDash marketplace across consumers, merchants, and Dashers. You need to define the metric hierarchy, handle conflicting metrics across sides, and design something stakeholders can actually use.

Metric hierarchy design · Three-sided marketplace thinking · Stakeholder communication
Start session →
DoorDash · Product Sense · Medium · ~20 min
Alcohol Delivery Vertical Success

DoorDash launched alcohol delivery as a new vertical. You need to define success metrics that account for the unique characteristics of alcohol delivery — different purchase patterns, regulatory requirements, and potential cannibalization of food orders.

Vertical success measurement · Regulatory compliance metrics · Cannibalization analysis
Start session →
Spotify · Product Sense · Medium · ~30 min
Discover Weekly Success Measurement

Spotify's Discover Weekly playlist has been live for years, but leadership wants a rigorous success measurement framework. The playlist generates high play counts, but the team debates whether play-through rate or save-to-library rate is the better signal of genuine discovery. You need to define success metrics, segment by user type, and propose an experiment design for testing algorithm changes.

Defining multi-dimensional success metrics for content features · Resolving tension between engagement and discovery signals · User segmentation for feature evaluation · Experiment design for recommendation algorithm changes
Start session →
Spotify · Product Sense · Medium · ~30 min
Spotify Wrapped Feature

Spotify Wrapped is the company's most viral annual feature. Leadership wants a comprehensive measurement framework that covers engagement, virality, retention, and brand impact. The challenge is that Wrapped is a once-a-year event, which makes traditional A/B testing nearly impossible. You need to define success metrics, measure social sharing impact, address the December DAU surge that doesn't stick, and propose evaluation methods for a feature you can only 'launch' once per year.

Multi-dimensional success measurement for cultural features · Viral loop analysis and social sharing measurement · Retention analysis for event-driven engagement spikes · Evaluation methodology when A/B testing isn't feasible
Start session →
Netflix · Product Sense · Medium · ~30 min
Recommendation Engine Success Measurement

Netflix's recommendation engine drives 80% of content played on the platform, but leadership wants to understand whether it's truly successful or just convenient. The team debates whether click-through rate, completion rate, or long-term retention lift is the right north star. You need to define a rigorous success framework, handle the tension between 'good enough' recommendations and genuinely great ones, and design an experiment to test a new algorithm.

Defining success metrics for recommendation systems · Distinguishing good recommendations from convenient ones · Balancing engagement vs. discovery in algorithmic systems · Experiment design for algorithm changes
Start session →
Netflix · Product Sense · Medium · ~30 min
Homepage PM — Success Metrics

You are the PM for the Netflix homepage. Leadership asks: 'Is the homepage successful?' The challenge is disentangling homepage performance from overall Netflix performance. The homepage has recently been redesigned with larger hero content and fewer visible rows, and there's a debate about whether faster time-to-play or deeper browsing is the better outcome.

Scoping product-level vs. feature-level metrics · Evaluating competing success signals · Handling user segment tradeoffs in product decisions · Connecting feature metrics to business outcomes
Start session →
Netflix · Product Sense · Hard · ~30 min
'Because You Watched' Discovery Metric

Netflix is launching a new personalized row labeled 'Because you watched [title]' to improve content discovery. The team needs a single North Star metric for discovery quality AND explicit anti-gaming constraints to ensure the metric reflects genuine discovery rather than inflated engagement. The previous row iteration increased play starts but not watch-through — a red flag for metric inflation.

Designing anti-gaming constraints for engagement metrics · Distinguishing genuine engagement from metric inflation · North Star metric definition for content discovery · Reasoning about autoplay and passive engagement artifacts
Start session →

Navigating Tradeoffs

Multi-stakeholder reasoning, optimization under constraints

Meta · Product Sense · Hard · ~25 min
Instagram Ad Load Optimization

The Instagram Monetization team wants to double the number of ads shown on Instagram, since it's the quickest way to nearly double revenue overnight. How would you determine the optimal ad load for Instagram?

Tradeoff reasoning · Stakeholder awareness · Experiment design · Long-term thinking
Start session →
Meta · Product Sense · Medium · ~20 min
Events Click Increase — Tradeoffs

Facebook Search has observed a 10% increase in clicks on Events. Before celebrating, you need to figure out whether this is a good sign, a bad sign, or something more nuanced. The PM wants your analysis.

Tradeoff analysis · Healthy vs. unhealthy engagement · Downstream measurement
Start session →
Google · Product Sense · Medium · ~25 min
YouTube Comments vs Watch Time

YouTube's recommendation algorithm update increased comments by 22% but decreased average watch time per session by 11%. The algorithm is surfacing more short, controversial, and opinion-driven content that generates comments but pulls users away from long-form content. The VP of Product wants to know which metric matters more for long-term retention.

Metric hierarchy and prioritization · Short-term vs. long-term tradeoff reasoning · Platform health and creator ecosystem thinking · Engagement quality vs. quantity distinction
Start session →
TikTok · Product Sense · Hard · ~30 min
TikTok LIVE Monetization

TikTok LIVE's gift revenue is up 20% month-over-month, but the number of unique gift senders is down 10%. The growth team celebrates the revenue; the trust & safety team is worried. You're asked to investigate.

Metric decomposition · Risk assessment · Stakeholder reasoning · Revenue analysis
Start session →
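
A first decomposition step for the TikTok LIVE case above: gift revenue equals unique senders times average revenue per sender, so the two reported changes pin down the third. The sketch below is only this arithmetic identity; the function name is hypothetical, and it says nothing about how spend is distributed across senders.

```python
def implied_revenue_per_sender_change(revenue_change, sender_change):
    """Relative change in revenue per sender implied by revenue = senders * revenue_per_sender."""
    return (1 + revenue_change) / (1 + sender_change) - 1


# Figures from the case: revenue up 20%, unique gift senders down 10%.
change = implied_revenue_per_sender_change(0.20, -0.10)
print(f"Implied change in revenue per sender: {change:.1%}")  # prints 33.3%
```

Average spend per remaining sender rising by about a third while the sender base shrinks is the kind of concentration signal that would explain the trust and safety team's concern, though the average alone cannot show whether a few very heavy spenders drive it.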
Coinbase · Product Sense · Hard · ~30 min
Trading Fee Reduction

Coinbase is considering a 20% fee reduction (0.5% to 0.4% per trade) to compete with lower-cost exchanges. Design the experiment and evaluate whether it's worth shipping.

Experiment design · Break-even analysis · Segmentation · Long-term reasoning
Start session →
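
The break-even question in the fee reduction case above has a simple starting point if fee revenue is modeled as trading volume times fee rate. That model is a simplifying assumption for illustration (it ignores spreads, fee tiers, and costs), and the function name is hypothetical.

```python
def break_even_volume_lift(old_fee_rate, new_fee_rate):
    """Relative volume increase needed to keep fee revenue flat after a fee cut."""
    return old_fee_rate / new_fee_rate - 1


# Figures from the case: fee drops from 0.5% to 0.4% per trade.
lift = break_even_volume_lift(0.005, 0.004)
print(f"Volume lift needed to break even: {lift:.0%}")  # prints 25%
```

Under this simplified model, a 20% fee cut needs a 25% volume lift just to hold fee revenue constant, which gives the experiment a concrete bar to test against.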
Reddit · Product Sense · Hard · ~30 min
Feed Ranking — Community Health vs. Engagement

Reddit is considering a new ranking algorithm that surfaces posts from smaller subreddits users haven't joined. Early tests show higher DAU but increased toxicity in small communities. The Growth team is excited; the Trust & Safety team is worried.

Community reasoning · Tradeoff evaluation · Moderator awareness · Guardrail thinking
Start session →
DoorDash · Product Sense · Medium · ~20 min
Conflicting Marketplace Metrics

An experiment improves consumer conversion by 3% but hurts Dasher acceptance rate by 2%. You need to recommend whether to launch. This tests your ability to reason about two-sided marketplace trade-offs without defaulting to a binary decision.

Two-sided marketplace reasoning · Second-order effect analysis · Nuanced decision-making under trade-offs · Stakeholder communication
Start session →
DoorDash · Product Sense · Hard · ~25 min
Dasher Payout System Design

DoorDash is redesigning the Dasher payout structure. The current system has complaints about unpredictability and perceived unfairness. You need to design a new system that improves Dasher satisfaction while maintaining platform economics.

Incentive design · Multi-objective optimization · Experiment design
Start session →
DoorDash · Product Sense · Hard · ~25 min
Grocery Cannibalization Analysis

DoorDash's grocery delivery vertical has grown rapidly, but the restaurant delivery team is concerned about cannibalization. You need to determine whether grocery is additive to the platform or stealing from restaurant orders.

Cannibalization analysis · Causal inference · User-level behavior tracking
Start session →
DoorDash · Product Sense · Hard · ~25 min
Sponsored Listings Effectiveness

DoorDash runs an ad marketplace where restaurants pay for sponsored placement in search results. You need to measure whether these ads actually drive incremental orders or just cannibalize organic traffic, while balancing the interests of merchants, consumers, and the platform.

Ad incrementality measurement · Three-sided marketplace thinking · Fairness analysis
Start session →
Spotify · Product Sense · Hard · ~35 min
Recommendation System Improvement

Spotify's recommendation system drives 35% of all listening hours. The ML team has a new model that improves click-through rate by 8% but reduces the diversity of recommended artists by 22%. The team is debating whether to ship it. You need to evaluate tradeoffs beyond CTR, define what 'good recommendations' means for a music platform, and propose a framework for balancing exploitation (playing what users already like) with exploration (helping users discover new music).

Tradeoff evaluation beyond single-metric optimization · Exploration vs. exploitation framework for recommendations · Multi-stakeholder analysis (users, artists, platform) · Long-term vs. short-term metric tension
Start session →

Designing Experiments

A/B tests, randomization, interpreting mixed results

Meta · Product Sense · Hard · ~25 min
Groups Feature Experiment

Facebook Groups wants to test a new feature: AI-generated discussion prompts posted automatically in groups with declining activity. The goal is to re-engage dormant groups. How would you design this experiment?

Experiment design · Metric definition · Risk assessment · Stakeholder thinking
Start session →
Meta · Product Sense · Hard · ~25 min
Marketplace Delivery Experiment

You're a data scientist at a food delivery company. The operations team ran a test where they gave Dashers a guaranteed minimum per delivery in a specific market. Results are in — but they're mixed.

Data interpretation · Tradeoff reasoning · Experiment rigor · Stakeholder awareness
Start session →
Meta · Product Sense · Hard · ~25 min
Instagram Reels on News Feed

Facebook is integrating Reels into the main News Feed experience. This is a major surface change that could boost short-form video consumption but risks cannibalizing existing content types. You need to design the experiment and measurement plan.

Experiment design · Cannibalization analysis · Metric tradeoffs
Start session →
Meta · Product Sense · Medium · ~20 min
Messenger Video Call Improvement

The Messenger team wants to improve the video call experience. They have several ideas — better connection quality, AR filters, screen sharing, and background blur. You need to design an experiment to test improvements and define what success looks like.

Experiment design · Metric definition · Unit of randomization
Start session →
Amazon · Product Sense · Hard · ~30 min
Search Ranking — Seller Quality

Amazon's search team developed a new ranking algorithm that prioritizes seller quality (reviews, fulfillment speed, return rate) more heavily. Design the experiment and evaluate the results.

Experiment design · Marketplace reasoning · Seller ecosystem · Tradeoff evaluation
Start session →
Uber · Product Sense · Hard · ~30 min
Surge Pricing Redesign Experiment

Uber is replacing discrete surge tiers (1.0x, 1.2x, 1.5x, 2.0x) with dynamic continuous surge (any multiplier from 1.0x to 3.0x, updated every minute). Design the experiment and evaluate the results.

Marketplace experiment design · Switchback vs. geo-level randomization · Spillover awareness · Two-sided metric definition
Start session →
Reddit · Product Sense · Medium · ~25 min
Onboarding Personalization Experiment

Reddit is redesigning new-user onboarding. Current: shows popular subreddits. Proposed: personalized recommendations based on stated interests. Design the experiment and evaluate the results.

Experiment design · Data interpretation · Tradeoff reasoning · Reddit-specific awareness
Start session →
DoorDash · Product Sense · Hard · ~30 min
Dispatch Dilemma

DoorDash is launching in Austin (sprawling, car-dependent) and Portland (compact, bike-friendly). The current dispatch algorithm optimizes solely for distance-to-restaurant and was trained on suburban markets. It systematically under-assigns bike Dashers in dense areas. Design an experiment to test city-optimized dispatch.

Marketplace experiment design · Geo-level vs. user-level randomization · Bias detection in algorithm design · Two-sided marketplace metrics
Start session →
DoorDash · Product Sense · Medium · ~25 min
Dasher Incentive Trap

DoorDash launched Peak Alerts — push notifications urging offline Dashers to go online during high-demand periods. Headline metrics look great: 28% more Dashers during alerts, 62% fewer unfulfilled orders. But deeper analysis may reveal that Peak Alerts cannibalize off-peak supply instead of creating net new supply.

Supply redistribution analysis · Net vs. gross impact measurement · Notification fatigue recognition · Two-sided marketplace fairness reasoning
Start session →
DoorDash · Product Sense · Hard · ~25 min
DashPass Pricing Experiment

The subscription team wants to test a $2/month price increase for DashPass (from $9.99 to $11.99). You're the data scientist tasked with designing this experiment. Pricing experiments have unique challenges — long-term churn effects aren't visible in short tests, and existing vs. new subscribers react very differently.

Pricing experiment design · Long-term vs. short-term measurement · Subscriber segmentation · Revenue vs. retention trade-offs
Start session →
DoorDash · Product Sense · Hard · ~25 min
Delivery Fee Coupon Incrementality

The marketing team wants to test a 5% off delivery fee coupon to drive consumer orders. But you know that many orders would have happened anyway. How do you measure the true incremental impact when marketplace dynamics create spillover effects?

Incrementality vs. gross impact measurement · Marketplace-aware experiment design · Geo-based randomization reasoning · Confound identification
Start session →
DoorDash · Product Sense · Hard · ~20 min
Switchback Test Design

You're testing a new real-time batching algorithm that assigns 2 orders to a single Dasher trip. This affects supply, demand, and timing across the entire local marketplace. Standard A/B testing breaks down because of interference effects — you need a switchback design.

When standard A/B testing fails · Switchback test design principles · Marketplace interference reasoning · Mixed results interpretation
Start session →
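
The core idea of the switchback design in the case above is that whole region-by-time-window units, not individual orders, are randomized between arms. The following is a minimal sketch of that assignment step only, assuming 60-minute windows; the region names, arm labels, and function names are hypothetical, and a real design would also need choices about window length, washout periods, and window-level analysis.

```python
import random
from datetime import datetime, timedelta

ARMS = ("control", "batching")  # hypothetical arm labels


def build_switchback_schedule(regions, n_windows, seed=0):
    """Randomly assign each (region, window index) unit to one arm."""
    rng = random.Random(seed)  # fixed seed so the schedule is reproducible
    return {
        (region, window): rng.choice(ARMS)
        for region in regions
        for window in range(n_windows)
    }


def arm_at(schedule, region, timestamp, start, window_minutes=60):
    """Look up which arm is live for a region at a given time."""
    elapsed_seconds = (timestamp - start).total_seconds()
    window = int(elapsed_seconds // (window_minutes * 60))
    return schedule[(region, window)]


start = datetime(2024, 1, 1, 0, 0)
schedule = build_switchback_schedule(["region_a", "region_b"], n_windows=48)
print(arm_at(schedule, "region_a", start + timedelta(hours=5, minutes=30), start))
```

Because every order in a region during a window sees the same algorithm, Dashers and orders within that window cannot interfere across arms; the unit of analysis is then the region-window, not the order.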
DoorDash · Product Sense · Medium · ~20 min
Dasher Push Notification Effectiveness

DoorDash sends push notifications to offline Dashers during high-demand periods, asking them to go online. The operations team wants to know if this is actually effective at increasing supply, or if Dashers would have come online anyway.

A/B test design · Marketplace metrics · Incrementality measurement
Start session →
DoorDash · Product Sense · Hard · ~25 min
Restaurant Ranking Algorithm Change

DoorDash is testing a new restaurant ranking algorithm that prioritizes delivery speed and proximity over ratings and popularity. Early data looks promising for delivery times, but there are concerns about fairness to merchants and consumer discovery.

Experiment design · Three-sided marketplace thinking · Fairness analysis
Start session →
DoorDash · Product Sense · Hard · ~25 min
Late Deliveries and Churn

The hypothesis is that late deliveries cause customer churn. But you can't randomly assign late deliveries — you need to find a clever identification strategy to prove or disprove the causal relationship and size the business impact.

Causal inference · Natural experiment identification · Business impact sizing
Start session →
Netflix · Product Sense · Medium · ~30 min
Free Trial Effectiveness

Netflix historically offered a 30-day free trial to new users, but recently eliminated it in many markets. Leadership wants to understand: was the free trial actually effective, and was removing it the right call? You need to define how you'd measure trial effectiveness, detect trial abuse, compare trial lengths, and evaluate the removal decision retroactively.

Measuring acquisition funnel effectiveness · Detecting and quantifying fraud/abuse · A/B test design for trial length optimization · Retroactive decision evaluation with observational data
Start session →
Netflix · Product Sense · Medium · ~25 min
Trailer Autoplay — Real Win or Metric Inflation?

Netflix is testing a new trailer autoplay behavior on title cards — when users hover for 2 seconds, a 30-second trailer starts playing. The test shows higher play starts but unchanged total watch time. The team is split on whether this is a real UX improvement that helps users discover content faster, or metric inflation that makes the dashboard look better without creating real value.

Distinguishing real UX wins from metric inflation · Designing disambiguation metrics · Evaluating features with mixed signals · Understanding how UI mechanics inflate engagement metrics
Start session →

Making Strategic Decisions

Framework building, competitive analysis, recommendations

Meta · Product Sense · Hard · ~25 min
Reels vs. Stories Investment

Instagram's leadership is debating where to invest their next engineering cycle: doubling down on Reels (short-form video) or revamping Stories (ephemeral content). You're asked to present a data-driven framework for making this decision.

Strategic reasoning · Framework building · Data-driven decision-making · Market awareness
Meta · Product Sense · Hard · ~25 min
Restaurants Recommender on Newsfeed

Facebook is considering adding a 'restaurants you may like' recommender in the News Feed. This would be a new content type competing for feed real estate alongside posts, Reels, ads, and stories. You need to design the system and measurement plan.

Recommender system design · Signal selection · Cannibalization analysis · Cold start
Meta · Product Sense · Hard · ~25 min
WhatsApp Spam Detection

WhatsApp is seeing a surge in spam messages across both individual and group chats. You've been asked to design a spam detection system. The catch: WhatsApp is end-to-end encrypted, which limits what data you can use.

System design · Privacy-constrained ML · Precision/recall tradeoffs · Measurement
Meta · Product Sense · Hard · ~25 min
Fake News in Stories

Meta's integrity team needs to estimate the prevalence of misinformation in Facebook Stories. Unlike the News Feed, Stories are ephemeral (24-hour lifespan) and often contain images, video, and text overlays — making automated detection harder. You need to design the measurement approach.

Measurement methodology · Sampling strategy · Classification systems · Uncertainty communication
Google · Product Sense · Medium · ~25 min
Gmail User Base Growth

Gmail has 1.8 billion accounts but active user growth has plateaued at 2% annually. The VP of Gmail wants to increase the active user base by 15% in 18 months. You need to define the right metrics, identify where the activation funnel breaks, and design a referral experiment to drive growth beyond organic sign-ups.

Growth metric design beyond vanity metrics · Activation funnel analysis · Referral experiment design · Market segmentation for growth strategy
DoorDash · Product Sense · Hard · ~25 min
Wrong Order Prediction System

Wrong orders are one of DoorDash's biggest customer pain points and refund cost drivers. You've been asked to design a machine learning system that minimizes wrong orders — but you first need to clarify what 'minimize' means and where in the pipeline to intervene.

ML system design · Problem framing · Precision/recall tradeoffs · Feature engineering
DoorDash · Product Sense · Hard · ~25 min
International Expansion Prioritization

DoorDash is considering expanding into 5 international markets. You need to build a prioritization framework that balances demand potential, operational feasibility, competitive landscape, and unit economics — then design a pilot to validate before committing.

Strategic prioritization · Market sizing · Pilot design · Unit economics
DoorDash · Product Sense · Hard · ~25 min
Consumer Lifetime Value Estimation

You need to estimate consumer lifetime value (LTV) from the first 30 days of behavior. This model will inform acquisition spend, retention investment, and growth strategy. The challenge: early behavior is noisy, promotional effects inflate initial usage, and different customer segments have wildly different trajectories.

LTV modeling · Feature selection · Segmentation · Business application
DoorDash · Product Sense · Medium · ~20 min
Refund Abuse Detection

DoorDash's refund costs are growing faster than order volume. Customer support suspects a significant portion of refund claims are fraudulent — users claiming missing or wrong items that were actually delivered correctly. You need to design a detection system and quantify the financial impact.

Fraud detection design · Threshold setting · ROI measurement
Spotify · Product Sense · Hard · ~35 min
Podcast Expansion — Customer Lifetime Value

Spotify has invested heavily in podcasts. Early data shows podcast listeners retain 25% longer than music-only users, but their music streaming hours declined 18%. Leadership wants to understand the true impact of podcasts on customer lifetime value. There's a debate about whether podcast engagement genuinely increases CLV or if podcast adopters were already higher-value users (selection bias). You need to design an approach to isolate the causal effect.

Causal inference and selection bias identification · Customer lifetime value decomposition · Strategic tradeoff analysis between content types · Experiment design for content strategy evaluation
Spotify · Product Sense · Medium · ~30 min
Free Trial to Paid Conversion

Spotify's 1-month free Premium trial converts at 46%, but 54% of trial users revert to the ad-supported tier. The Growth team wants to understand which behaviors during the trial predict conversion, what nudges could improve conversion, and whether testing different trial lengths (7-day, 14-day, 3-month) could improve outcomes. You need to build a framework for diagnosing non-conversion and designing interventions.

Behavioral prediction from product usage data · Conversion funnel optimization · Experiment design for pricing/trial structure · Nudge design grounded in user behavior signals
Netflix · Product Sense · Hard · ~35 min
Netflix Original Performance

Netflix has invested $17B in original content and needs a rigorous framework for measuring the performance of Netflix Originals. The challenge is that Originals serve multiple purposes — driving subscriptions, reducing churn, building brand, and filling catalog gaps. A simple viewership metric misses most of the value. You need to build a framework that informs renewal vs. cancellation decisions.

Multi-dimensional value measurement for content investments · Indirect value attribution (brand, retention, acquisition) · Renewal vs. cancellation decision frameworks · Navigating tension between viewership and strategic value
Netflix · Product Sense · Medium · ~25 min
Inactive User Reactivation

One million Netflix users have been inactive for six months — they're paying subscribers who haven't watched anything. Product and marketing teams want to re-engage them, but finance questions whether it's worth the investment. You need to segment these users, design a reactivation strategy, measure its effectiveness, and determine when the cost of reactivation isn't worth it.

User segmentation for reactivation · ROI framework for reactivation campaigns · Win-back campaign design and measurement · Deciding when not to invest in reactivation