☕ The Data-Driven Barista

Bridging the gap between the counter and the textbook.

💻 Simulation (Monte Carlo)

"What could happen tomorrow?"

Reality: You know your average is 15 customers/hour. You use the computer to "play out" 1,000 fake hours to see how bad the "rushes" get.

The Data: You feed the model one number: $\lambda = 15$. The computer then generates random outcomes based on that rate.
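As a rough illustration of "playing out" 1,000 fake hours, here is a minimal sketch using only the Python standard library. The function names, the seed, and the choice of Knuth's multiplication method for drawing Poisson counts are assumptions made for this example; the page's own simulator may work differently.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count = 0
    product = rng.random()
    # Keep multiplying uniform draws until the product falls below e^(-lam).
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_hours(lam=15, n_hours=1000, seed=0):
    """Simulate n_hours independent hours of customer arrivals."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    return [sample_poisson(lam, rng) for _ in range(n_hours)]


hours = simulate_hours()
mean_customers = sum(hours) / len(hours)
busiest_hour = max(hours)
share_over_25 = sum(h > 25 for h in hours) / len(hours)

print(f"average per hour: {mean_customers:.2f}")
print(f"busiest simulated hour: {busiest_hour}")
print(f"share of hours with more than 25 customers: {share_over_25:.3f}")
```

The share of simulated hours above a chosen threshold is the simulation's estimate of how often a rush of that size would occur.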

Mathematical Model: Poisson PMF

$$P(X=k) = \frac{\lambda^k e^{-\lambda}}{k!}$$

\(\lambda\) (Lambda): The average rate (your "known" data: 15/hr).

\(k\): A specific "What if" number of customers (e.g., 25).

\(P(X=k)\): The probability that exactly \(k\) people show up.

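What this number represents: The PMF gives the probability of exactly \(k\) customers; adding up those probabilities for every count above a threshold gives the Probability of Failure. If the simulation says there is a 5% chance of $X > 25$, then keeping 25 cups ready covers you in 95% of hours.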
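The formula can also be evaluated directly instead of simulated. The sketch below computes the PMF, the tail probability \(P(X > k)\), and the smallest stock level that keeps the failure risk under a chosen limit. The helper names and the 5% default risk are assumptions for illustration, and the code computes the tail for $\lambda = 15$ rather than assuming the 5% figure used in the text.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with rate lam."""
    return lam**k * math.exp(-lam) / math.factorial(k)


def poisson_tail(k, lam):
    """P(X > k): one minus the sum of the PMF from 0 through k."""
    return 1.0 - sum(poisson_pmf(i, lam) for i in range(k + 1))


def smallest_safe_stock(lam, risk=0.05):
    """Smallest k such that P(X > k) is at most the accepted risk."""
    k = 0
    while poisson_tail(k, lam) > risk:
        k += 1
    return k


lam = 15
print(f"P(X = 25)  = {poisson_pmf(25, lam):.5f}")
print(f"P(X > 25)  = {poisson_tail(25, lam):.5f}")
print(f"cups needed for at most 5% risk: {smallest_safe_stock(lam)}")
```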

📏 Estimation (Bootstrap)

"What is the true average?"

Reality: You only have data for *3 days*, so you don't know the "true" average of the shop. You resample your 3 days over and over to find a safety range.

The Data: Your real samples: [12, 18, 15]. You "bootstrap" by picking from these three numbers (with replacement) to build a distribution.

Estimated Milk Needed: 15.5 Gal (95% CI: [13.2, 17.8])

Statistical Method: Percentile Bootstrap

$$CI = [\theta^*_{2.5\%}, \theta^*_{97.5\%}]$$

\(\theta^*\): The statistic (mean) calculated from a "resampled" data set.

\(CI\): The Confidence Interval (your safety net).

What this number represents: It accounts for Sampling Error. It says: "Based on my 3 days of data, I am 95% confident the real long-term average is between 13.2 and 17.8."
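The percentile bootstrap described above can be sketched in a few lines. The function name, the 10,000 resamples, and the seed are assumptions for this example. With only three data points the resampled mean can take just seven distinct values, so the interval this produces is coarse and need not match the figures displayed above.

```python
import random
import statistics


def percentile_bootstrap_ci(data, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of data."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    n = len(data)
    # Each theta* is the mean of n values drawn from data with replacement.
    theta_star = sorted(
        statistics.fmean(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    # Read off the 2.5% and 97.5% positions of the sorted resampled means.
    lower_index = int((alpha / 2) * n_resamples)
    upper_index = int((1 - alpha / 2) * n_resamples) - 1
    return theta_star[lower_index], theta_star[upper_index]


samples = [12, 18, 15]  # the three observed days from the text
low, high = percentile_bootstrap_ci(samples)
print(f"sample mean: {statistics.fmean(samples):.1f}")
print(f"95% percentile bootstrap CI: [{low:.1f}, {high:.1f}]")
```

More days of data would allow many more distinct resampled means, which is what makes the percentile interval smoother and more trustworthy.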

🔍 EDA (Distribution Analysis)

"Who are these people?"

Reality: You plot age vs. sugar. You find a "Two-Hump Mystery."

The Data: You have 100 rows of (Age, Sugar). The average is 4 packets, but the plot shows two distinct groups (Teens and Seniors).

Distribution: Bimodal Mixture

$$f(x) = w_1 N(\mu_1, \sigma_1) + w_2 N(\mu_2, \sigma_2)$$

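\(w_1, w_2\): The weights (the share of customers in each group, e.g. young vs. old; they sum to 1).

\(\mu_1, \mu_2\): The average sugar for each hump.

\(\sigma_1, \sigma_2\): The spread of sugar within each hump.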

What this number represents: It shows that the Population is Heterogeneous. It tells you that your "average" is a lie: you actually have two separate markets to serve.
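To see why the overall average misleads, the mixture density can be evaluated at the average and at one of the humps. Every parameter value below is hypothetical, chosen only so that the overall mean comes out to 4 packets; the values are not taken from the 100 rows, and the sketch does not say which hump belongs to which age group.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def mixture_pdf(x, w1, mu1, sigma1, w2, mu2, sigma2):
    """f(x) = w1 * N(mu1, sigma1) + w2 * N(mu2, sigma2)."""
    return w1 * normal_pdf(x, mu1, sigma1) + w2 * normal_pdf(x, mu2, sigma2)


# Hypothetical parameters: two equal groups centered at 1 and 7 packets.
params = dict(w1=0.5, mu1=1.0, sigma1=0.8, w2=0.5, mu2=7.0, sigma2=0.8)

overall_mean = params["w1"] * params["mu1"] + params["w2"] * params["mu2"]
density_at_mean = mixture_pdf(overall_mean, **params)
density_at_hump = mixture_pdf(params["mu1"], **params)

print(f"overall mean: {overall_mean:.1f} packets")
print(f"density at the mean: {density_at_mean:.5f}")
print(f"density at a hump:   {density_at_hump:.5f}")
```

Under these assumed parameters the density at the overall mean is far lower than at either hump, which is the numerical version of the point that almost nobody orders the "average" amount of sugar.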