Question 1 — Designing a Frequency Distribution

A university surveyed 120 students on weekly hours spent on part-time jobs. The data ranges from 0 to 30 hours.
Design a frequency distribution table with 6 equal-width classes.

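📖 Answer: Class width = (30 - 0) / 6 = 5. Classes: 0 to under 5, 5 to under 10, 10 to under 15, 15 to under 20, 20 to under 25, 25 to 30 (the last class includes 30).
📝 Explanation: The key step in building a frequency distribution is choosing the class width. The range is 30 and there are 6 classes, so each class is 5 wide. Writing the classes as 0–4, 5–9, ..., 25–30 would make the last class wider than the others and would leave fractional values such as 4.5 hours without a class, so each class runs from its lower limit up to, but not including, the next lower limit, with the maximum of 30 placed in the last class.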
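
As a rough illustration of the tallying step, the sketch below counts values into 6 equal-width classes over 0 to 30. The function name `frequency_table` and the sample `hours` list are made up for this example; they are not the survey's real data.

```python
def frequency_table(values, low, high, k):
    """Count values in k equal-width classes covering [low, high]."""
    width = (high - low) / k
    counts = [0] * k
    for v in values:
        # Classes are half-open [a, b); the maximum value goes in the last class.
        idx = min(int((v - low) // width), k - 1)
        counts[idx] += 1
    edges = [low + i * width for i in range(k + 1)]
    return edges, counts

# Hypothetical sample of weekly hours (illustrative only).
hours = [0, 3, 5, 8, 12, 14.5, 15, 20, 24, 25, 30]
edges, counts = frequency_table(hours, 0, 30, 6)
for i, c in enumerate(counts):
    print(f"{edges[i]:>4.0f} to {edges[i + 1]:<4.0f} {c}")
```

Treating each class as "lower limit up to, but not including, the upper limit" is one common convention; it keeps all classes the same width and gives every value exactly one class.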

Question 2 — Choosing Number of Classes

A dataset of 5,000 sales records needs to be grouped into a frequency distribution.
Would you recommend 5, 10, or 20 classes? Why?

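📖 Answer: 20 classes.
📝 Explanation: With a large dataset, more classes are needed to avoid losing information. Grouping 5,000 records into only 5 classes is too coarse, and 10 classes is still on the low side. 20 classes show the shape of the distribution in more detail.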
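
Two common rules of thumb can serve as a sanity check on the number of classes. The sketch below applies Sturges' rule and the square-root rule to n = 5,000; neither rule is mentioned in the question itself, and both are only starting points, not fixed requirements.

```python
import math

def sturges_classes(n):
    """Sturges' rule: k = 1 + log2(n), rounded up."""
    return math.ceil(1 + math.log2(n))

def sqrt_classes(n):
    """Square-root rule: k = sqrt(n), rounded up."""
    return math.ceil(math.sqrt(n))

n = 5000
print("Sturges:", sturges_classes(n))
print("Square root:", sqrt_classes(n))
```

Both rules suggest more than 10 classes for 5,000 records, which is consistent with preferring 20 over 5 or 10 among the options given.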

Question 3 — Detecting Skewness

A histogram of monthly incomes shows most employees earn between 5k–8k RMB, with a long right tail up to 50k RMB.
What kind of skewness does this distribution show?

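📖 Answer: Right-skewed (positively skewed).
📝 Explanation: Most values cluster in the low range, while a few high values stretch the tail to the right. This is a typical positive skew.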
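
A quick numerical check for right skew is that the mean sits above the median and the moment-based skewness coefficient is positive. The sketch below uses a made-up list of incomes (in thousand RMB) shaped like the histogram described; the helper `moment_skewness` is an illustrative name, not a standard library function.

```python
import statistics

def moment_skewness(values):
    """Population skewness: third central moment over variance^(3/2)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical incomes in thousand RMB: most between 5 and 8, one at 50.
incomes = [5, 5.5, 6, 6, 6.5, 7, 7, 7.5, 8, 50]
mean_income = statistics.fmean(incomes)
median_income = statistics.median(incomes)
print(mean_income, median_income, moment_skewness(incomes))
```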

Question 4 — Comparing Two Groups

Two classes recorded GPA distributions:

  • Class A: GPAs mostly between 3.0–3.5.
  • Class B: GPAs spread evenly between 2.0–4.0.
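Which class shows greater variability, and why?

📖 Answer: Class B shows greater variability.
📝 Explanation: Class A's GPAs are concentrated in a narrow band, so their dispersion is small. Class B's GPAs are spread evenly over a wider interval, so their variability is greater.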
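
Variability can be put into numbers with the range and the standard deviation. The sketch below uses two made-up GPA lists that match the descriptions of Class A and Class B; the values are illustrative only.

```python
import statistics

# Hypothetical GPAs matching the descriptions (illustrative only).
class_a = [3.0, 3.1, 3.2, 3.3, 3.4, 3.5]   # concentrated in 3.0-3.5
class_b = [2.0, 2.4, 2.8, 3.2, 3.6, 4.0]   # spread evenly over 2.0-4.0

range_a = max(class_a) - min(class_a)
range_b = max(class_b) - min(class_b)
spread_a = statistics.pstdev(class_a)   # population standard deviation
spread_b = statistics.pstdev(class_b)

print(f"Class A: range={range_a:.1f}, sd={spread_a:.3f}")
print(f"Class B: range={range_b:.1f}, sd={spread_b:.3f}")
```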

Question 5 — Choosing Class Width

A manager records repair costs ranging from 120 RMB to 520 RMB.
If she wants 8 classes, what class width should she use?

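📖 Answer: Width = (520 - 120) / 8 = 50 RMB.
📝 Explanation: The range is 400 and there are 8 classes, so the class width is 50. The classes are 120 to under 170, 170 to under 220, and so on up to 470 to 520. The last class must include 520, because classes written as 120–169, ..., 470–519 would leave the maximum value out.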
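
The width calculation and the resulting class boundaries can be sketched as follows. The helper name `class_boundaries` is hypothetical; each pair is (lower boundary, upper boundary), and the final upper boundary lands exactly on the maximum, so the maximum has to be counted in the last class.

```python
def class_boundaries(low, high, k):
    """Return the class width and k (lower, upper) boundary pairs."""
    width = (high - low) / k
    pairs = [(low + i * width, low + (i + 1) * width) for i in range(k)]
    return width, pairs

width, classes = class_boundaries(120, 520, 8)
print("width:", width)
for lower, upper in classes:
    print(f"{lower:.0f} to {upper:.0f}")
```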

Question 6 — Misleading Frequency Table

If a frequency table uses unequal class widths, why might this be misleading? Give one scenario.

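📖 Answer: Unequal widths distort comparisons. For example, if sales are grouped into a 0–100 class and a 101–110 class, the wide class can show a much higher frequency simply because it spans ten times as many values, which hides how concentrated sales are in the narrow class.
📝 Explanation: Unequal class widths distort proportions. If one class is 100 wide and another only 10 wide, comparing raw frequencies directly misleads the reader. Comparing frequency per unit of width (frequency density) is fairer.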
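
One way to make unequal classes comparable is frequency density, that is, frequency divided by class width. The counts below are invented for illustration; the function name `frequency_density` is likewise just for this sketch.

```python
def frequency_density(count, lower, upper):
    """Frequency per unit of class width."""
    return count / (upper - lower)

# Hypothetical counts: the wide class has more sales in total...
wide = frequency_density(50, 0, 100)      # class of width 100
narrow = frequency_density(20, 100, 110)  # class of width 10

# ...but the narrow class is far denser per RMB of width.
print(wide, narrow)
```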

Question 7 — Cumulative Frequency Interpretation

In a GPA distribution, cumulative % below 3.0 = 40%.
What does this mean in plain language?

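📖 Answer: 40% of students have a GPA below 3.0.
📝 Explanation: Cumulative frequency is the proportion of observations up to a given point. Here, students with a GPA below 3.0 make up 40% of the total.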
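
The cumulative percentage is a running total of class frequencies divided by the overall count. The sketch below uses made-up class counts chosen so that 40% of students fall below 3.0; the numbers are assumptions for illustration.

```python
from itertools import accumulate

# Hypothetical GPA classes, identified by their upper boundary.
upper_bounds = [2.0, 2.5, 3.0, 3.5, 4.0]
counts = [5, 10, 25, 40, 20]

total = sum(counts)
cum_pct = [100 * c / total for c in accumulate(counts)]

for bound, pct in zip(upper_bounds, cum_pct):
    print(f"up to {bound}: {pct:.0f}%")
```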

Question 8 — Business Application

A retailer uses frequency distribution of daily sales to set staffing levels.
How can this information be applied?

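📖 Answer: By identifying peak sales ranges and how often they occur, the retailer can allocate more staff on busy days.
📝 Explanation: The frequency distribution shows the range in which high-sales days cluster, so the business can adjust staffing accordingly and avoid long queues and lost customers.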

Question 9 — Comparing Frequency Tables

Two stores show:

  • Store A: Most sales in 100–200 RMB range.
  • Store B: Sales evenly spread across 50–500 RMB.
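Which store has more consistent customer spending?

📖 Answer: Store A has more consistent spending.
📝 Explanation: Store A's customer spending is concentrated in a narrow range, while Store B's is more dispersed, which indicates less consistent spending.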

Question 10 — Designing a Histogram

Why is it inappropriate to create a histogram for categorical data?

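📖 Answer: Histograms require numerical intervals on a continuous axis; categorical data has no numerical scale or equal spacing, and nominal categories have no inherent order.
📝 Explanation: A histogram depends on class intervals and a number axis. Categorical data (such as gender or color) has no continuity or notion of equal distance, so a bar chart should be used instead.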

Question 11 — Ogive Usage

A company plots an ogive of delivery times.
How could this be used to guarantee service-level targets?

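📖 Answer: By reading the cumulative % at specific time thresholds (e.g., 95% of deliveries take less than 3 days).
📝 Explanation: The cumulative curve shows directly what proportion of deliveries is completed within a given time limit. This can be used to set KPIs, such as a target of 95% delivered within 3 days.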
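
Reading an ogive at a threshold amounts to linear interpolation between the plotted points. The sketch below assumes made-up delivery-time boundaries (in days) and cumulative percentages; `cumulative_at` is a hypothetical helper name.

```python
def cumulative_at(x, bounds, cum_pct):
    """Read the ogive at x by linear interpolation between plotted points."""
    if x <= bounds[0]:
        return cum_pct[0]
    for i in range(1, len(bounds)):
        if x <= bounds[i]:
            x0, x1 = bounds[i - 1], bounds[i]
            y0, y1 = cum_pct[i - 1], cum_pct[i]
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return cum_pct[-1]

# Hypothetical ogive points: class boundaries in days and cumulative %.
bounds = [0, 1, 2, 3, 4, 5]
cum_pct = [0, 30, 70, 95, 99, 100]

print(cumulative_at(3, bounds, cum_pct))    # share delivered within 3 days
print(cumulative_at(2.5, bounds, cum_pct))  # interpolated between 2 and 3 days
```

Interpolated readings assume deliveries are spread evenly inside each class, so they are estimates rather than exact figures.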

Question 12 — Impact of Grouping

How does grouping continuous data into classes affect detail? Provide one advantage and one disadvantage.

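📖 Answer: Advantage: Simplifies analysis. Disadvantage: Loses exact data points.
📝 Explanation: Advantage: the data is simplified and the overall pattern becomes clear. Disadvantage: individual values are hidden, so precision is reduced.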

Question 13 — Business Risk Example

Insurance claims are grouped into frequency classes.
How might cumulative frequency help risk managers?

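📖 Answer: It shows the % of claims below a threshold, which is useful for setting premiums or reserves.
📝 Explanation: Cumulative frequency reveals what share of claims falls below a given amount. The insurer can use this to decide on pricing and reserves.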

Question 14 — Wrong Grouping

A dataset ranges from 52 to 109. Someone uses 5 classes of width 15.
What’s wrong with this choice?

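📖 Answer: The width is too large for the range: 5 classes of width 15 span 75 units and end at 127, far above the maximum of 109, so the last class (112 and above) is empty.
📝 Explanation: The range is 109 - 52 = 57. Five classes of width 15 cover 75, which overshoots the range and creates an unnecessary empty class. A better width is 57 / 5 = 11.4, rounded up to 12, giving classes 52–63, 64–75, 76–87, 88–99, and 100–111.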
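
The overshoot can be checked by listing where each class starts and counting the classes that begin above the maximum. The sketch below also computes a candidate width for five classes by dividing the range by the class count and rounding up; the helper names are hypothetical.

```python
import math

def class_starts(low, k, width):
    """Lower limit of each of the k classes."""
    return [low + i * width for i in range(k)]

def classes_beyond_max(low, high, k, width):
    """Number of classes that start above the largest data value."""
    return sum(1 for s in class_starts(low, k, width) if s > high)

low, high, k = 52, 109, 5
print(class_starts(low, k, 15), classes_beyond_max(low, high, k, 15))

# Candidate width for 5 classes: range / k, rounded up.
suggested = math.ceil((high - low) / k)
print(suggested, class_starts(low, k, suggested))
```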

Question 15 — Detecting Outliers

A frequency table of exam scores shows most students between 60–90, but 2 students scored 10 and 100.
How should this be treated?

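📖 Answer: Recognize them as outliers; consider reporting them separately or adjusting the class intervals.
📝 Explanation: Extreme values distort a frequency distribution. They should be flagged separately or explained in the analysis rather than simply folded into the regular classes.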

Question 16 — Comparing Cumulative Curves

Two products’ cumulative sales curves are plotted.
Product A reaches 80% sales by mid-year; Product B only 40%.
Which has faster early sales growth?

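📖 Answer: Product A has faster early sales growth.
📝 Explanation: A steep cumulative curve indicates fast growth. Product A completes 80% of its sales in the first half of the year, well ahead of Product B's 40%.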

Question 17 — Cross-Analysis

A university collects GPA data by gender.
What kind of table/graph should be used to compare frequency distributions between groups?

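📖 Answer: A side-by-side histogram or grouped frequency table.
📝 Explanation: A grouped histogram or frequency table makes differences between the male and female GPA distributions easy to see.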

Question 18 — Pareto Principle

A store finds 20% of products account for 80% of sales.
Which chart best shows this?

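📖 Answer: Pareto chart.
📝 Explanation: A Pareto chart combines bars sorted from largest to smallest with a cumulative line, so it shows the 80/20 rule most clearly.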
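
The numbers behind a Pareto chart are a descending sort plus a running cumulative share. The sketch below uses invented sales figures for ten products, chosen so that the top two (20% of products) account for 80% of sales; `pareto_table` is a hypothetical helper name.

```python
from itertools import accumulate

def pareto_table(sales):
    """Rows of (product, sales, cumulative % of total), largest first."""
    ranked = sorted(sales.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(sales.values())
    running = accumulate(value for _, value in ranked)
    return [(name, value, 100 * c / total)
            for (name, value), c in zip(ranked, running)]

# Hypothetical sales per product (illustrative only).
sales = {"P3": 40, "P1": 450, "P7": 25, "P2": 350, "P5": 30,
         "P4": 35, "P6": 25, "P8": 20, "P9": 15, "P10": 10}

table = pareto_table(sales)
for name, value, cum in table:
    print(f"{name:<4} {value:>4} {cum:>6.1f}%")
```

The bars of the chart would use the sales column and the line would use the cumulative percentage column.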

Question 19 — Business Forecast

A bank groups loan defaults by age bracket.
How might this frequency distribution inform future lending policies?

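📖 Answer: By identifying high-default age groups, the bank can adjust credit policies or require stricter checks.
📝 Explanation: The frequency distribution reveals the groups in which defaults are concentrated, so the bank can raise interest rates or tighten screening for those groups accordingly.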

Question 20 — Combining Categorical & Quantitative Data

A survey collects both customer age (quantitative) and preferred brand (categorical).
How should data be analyzed?

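📖 Answer: Cross-tabulation (pivot table) showing brand preference by age group.
📝 Explanation: A cross-tabulation combines age groups (quantitative, after grouping) with brand preference (categorical), which shows how preferences differ across age brackets.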
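
A cross-tabulation can be sketched with nothing more than a counter keyed by (age group, brand). The responses, the age brackets, and the brand labels "X" and "Y" below are all made up for illustration.

```python
from collections import Counter

def age_group(age):
    """Map an age to an illustrative bracket."""
    if age < 30:
        return "under 30"
    if age < 50:
        return "30-49"
    return "50+"

# Hypothetical survey responses: (age, preferred brand).
responses = [(22, "X"), (27, "X"), (35, "Y"), (41, "X"),
             (48, "Y"), (55, "Y"), (63, "Y"), (29, "Y")]

crosstab = Counter((age_group(age), brand) for age, brand in responses)

for group in ["under 30", "30-49", "50+"]:
    row = {brand: crosstab[(group, brand)] for brand in ["X", "Y"]}
    print(group, row)
```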