Business Statistics — Lecture 2 (商业统计 第2讲)


1. Data, Elements, Variables, Observations (数据、要素、变量、观测值)

Explanation / 解释

  • Data are facts and figures (numbers, charts, tables) that are collected, analyzed, and interpreted. (数据是基于事实的信息,如数字、图表、表格,用于收集、分析和解释)
  • Elements are the entities from which data are collected. (要素是收集数据的对象/实体)
  • Variables are characteristics of elements. (变量是要素的特征/属性)
  • An observation is the set of measurements obtained for a single element. (观测值是针对某一要素的一组测量值)

Example / 例子

  • Data: Company sales = $500,000 in Jan. (一月公司销售额 = 50万美元)
  • Element: Each student in a class. (每个学生)
  • Variable: Age, gender, GPA. (年龄、性别、绩点)
  • Observation: {Age=20, Gender=F, GPA=3.5}. (观测值记录在行里)

Extension / 拓展

  • Data are the foundation of decision-making. (数据是决策的基础)
  • Clearly identifying the elements defines what the data describe, which is a first step toward a representative sample. (明确要素可界定数据描述的对象,是保证样本代表性的第一步)
  • Variables can be qualitative or quantitative. (变量可定性/定量)
  • Observations are stored in dataset rows. (观测值存放在数据行)
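
A minimal sketch of how these four terms map onto a small dataset. The roster below is hypothetical (the names and all values other than the Age=20, Gender=F, GPA=3.5 row are made up for illustration): each student is an element, each key is a variable, and each dictionary is one observation, i.e. one row.

```python
# Hypothetical class roster used only for illustration.
# Element     = one student
# Variable    = one characteristic (age, gender, gpa)
# Observation = the full set of measurements for one student (one row)
students = [
    {"name": "Student A", "age": 20, "gender": "F", "gpa": 3.5},
    {"name": "Student B", "age": 22, "gender": "M", "gpa": 3.1},
    {"name": "Student C", "age": 21, "gender": "F", "gpa": 3.8},
]

# "name" only identifies the element; the remaining keys are the variables.
variables = [key for key in students[0] if key != "name"]

n_elements = len(students)                 # number of rows (observations)
n_values = n_elements * len(variables)     # total number of data values

print("Variables:", variables)
print("Elements:", n_elements)
print("Data values:", n_values)
```

With 3 elements and 3 variables, the dataset holds 9 data values in total.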

2. Scales of Measurement (测量尺度)

Explanation / 解释

  • Four scales: nominal, ordinal, interval, ratio. (四类尺度:名义、顺序、区间、比率)

Example / 例子

  • Nominal: Gender (Male/Female). (名义:性别 男/女)
  • Ordinal: Rating (Very good=5, Very bad=1). (顺序:评分 非常好=5, 非常差=1)
  • Interval: Temperature (°C). (区间:温度 °C)
  • Ratio: Distance, probability. (比率:距离、概率)

Extension / 拓展

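  • The scale of measurement determines which analyses are valid; for example, a mean is meaningful for interval and ratio data but not for nominal codes. (测量尺度决定哪些分析方法有效;例如,均值适用于区间和比率数据,但不适用于名义编码)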
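
The difference between interval and ratio scales can be illustrated with a short sketch. The temperature values are hypothetical, and the Kelvin conversion is brought in only for contrast (Kelvin has a true zero, Celsius does not). The distances reuse the 10 km vs. 3 km example from these notes.

```python
def celsius_to_kelvin(celsius):
    """Convert degrees Celsius to kelvins (a scale with a true zero)."""
    return celsius + 273.15

# Interval scale: differences are meaningful, ratios are not.
t1_c, t2_c = 10.0, 20.0                     # hypothetical temperatures in °C
diff_c = t2_c - t1_c                        # a 10-degree difference is meaningful
naive_ratio = t2_c / t1_c                   # 2.0, but 20 °C is NOT "twice as hot"
kelvin_ratio = celsius_to_kelvin(t2_c) / celsius_to_kelvin(t1_c)

# Ratio scale: zero means "none", so ratios are meaningful.
d1_km, d2_km = 3.0, 10.0
distance_ratio = d2_km / d1_km              # 10 km really is about 3.3 times 3 km

print(f"Difference in °C: {diff_c}")
print(f"Naive °C ratio: {naive_ratio}")
print(f"Ratio in kelvins: {kelvin_ratio:.3f}")
print(f"Distance ratio: {distance_ratio:.2f}")
```

The naive ratio changes when the unit changes, which is why ratios on an interval scale should not be interpreted.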

3. Data Types (数据类型)

Explanation / 解释

  • Categorical (Nominal/Ordinal). (类别数据:名义/顺序)
  • Quantitative (Interval/Ratio). (定量数据:区间/比率)

Example / 例子

  • Categorical: Male=1, Female=2; Very good=5, Very bad=1.
    (类别:男=1, 女=2; 非常好=5, 非常差=1)
  • Quantitative: Distance 10km vs. 3km; Loss chance 0.5 vs. 0.3.
    (定量:距离10公里 vs. 3公里;亏损概率0.5 vs. 0.3)

Extension / 拓展

  • Categorical: analyzed with counts, percentages. (类别数据用频数/百分比分析)
  • Quantitative: supports mean, variance, regression. (定量数据支持均值、方差、回归分析)
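
As a rough sketch of the two styles of analysis, the snippet below summarizes categorical data with counts and percentages, and quantitative data with a mean and a sample variance. Both lists are hypothetical, and only the Python standard library is used.

```python
from collections import Counter
from statistics import mean, variance

# Hypothetical survey ratings coded 1 (very bad) to 5 (very good): categorical.
ratings = [5, 4, 5, 3, 1, 4, 5]

# Hypothetical distances in km: quantitative.
distances_km = [10.0, 3.0, 7.5, 4.5]

# Categorical summary: frequency and percentage of each code.
counts = Counter(ratings)
percentages = {
    code: 100 * n / len(ratings) for code, n in sorted(counts.items())
}

# Quantitative summary: mean and sample variance (divides by n - 1).
avg_distance = mean(distances_km)
var_distance = variance(distances_km)

for code, pct in percentages.items():
    print(f"Rating {code}: {counts[code]} responses ({pct:.1f}%)")
print(f"Mean distance: {avg_distance} km")
print(f"Sample variance: {var_distance} km^2")
```

Note that `statistics.variance` computes the sample variance; `statistics.pvariance` would give the population variance instead.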

4. Data Collection Methods (数据收集方式)

Explanation / 解释

  • Cross-sectional: data collected at the same, or approximately the same, point in time. (横截面数据:在同一或大致同一时间点收集)
  • Time series: data collected over several time periods. (时间序列:在多个时间段收集)

Example / 例子

  • Cross-sectional: May 2025 customer survey. (横截面:2025年5月顾客调查)
  • Time series: Sales from 2020 to 2025. (时间序列:2020–2025年销售额)

Extension / 拓展

  • Cross-sectional: good for group comparison. (横截面对比群体)
  • Time series: useful for trend, forecasting. (时间序列用于趋势和预测)
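
The two kinds of data lead to different calculations. The sketch below uses entirely hypothetical figures: a one-time survey (cross-sectional) summarized by group, and yearly sales for 2020 to 2025 (time series) summarized by year-over-year growth.

```python
from collections import defaultdict
from statistics import mean

# Cross-sectional: hypothetical spending ($) from a single customer survey.
survey = [("online", 40.0), ("store", 30.0), ("online", 50.0), ("store", 36.0)]

spend_by_channel = defaultdict(list)
for channel, spend in survey:
    spend_by_channel[channel].append(spend)

# Group comparison: average spend per channel at that one point in time.
mean_spend = {ch: mean(values) for ch, values in spend_by_channel.items()}

# Time series: hypothetical annual sales in thousands of dollars.
sales_by_year = {2020: 400, 2021: 420, 2022: 462, 2023: 450, 2024: 495, 2025: 520}

# Trend: growth of each year relative to the previous year.
years = sorted(sales_by_year)
yoy_growth = {
    year: (sales_by_year[year] - sales_by_year[prev]) / sales_by_year[prev]
    for prev, year in zip(years, years[1:])
}

print("Mean spend by channel:", mean_spend)
for year, growth in yoy_growth.items():
    print(f"{year}: {growth:+.1%}")
```

The first year has no growth figure because there is no earlier period to compare it with.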

5. Data Sources (数据来源)

Explanation / 解释

  • Existing sources (internal/external). (现有来源:内部/外部)
  • Statistical studies (experimental/observational). (统计研究:实验/观察)

Example / 例子

  • Internal: sales records. (内部:销售记录)
  • External: industry reports. (外部:行业报告)

Extension / 拓展

  • Sources differ in cost, reliability, accessibility. (来源在成本、可靠性、可获得性上不同)

6. Data Acquisition Errors (数据获取误差)

Explanation / 解释

  • Errors in data collection can be reduced through consistency checks and common sense.
    (数据收集中的误差可用一致性检查和常识减少)

Example / 例子

  • Flag an impossible value: Age = 200 fails a basic range check and should be corrected or excluded. (标记不可能的值:年龄=200 未通过基本范围检查,应更正或排除)

Extension / 拓展

  • Data cleaning is essential before analysis. (分析前必须进行数据清理)
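
A simple consistency check of this kind might look like the sketch below. The function name `find_suspect_ages`, the records, and the plausible range of 0 to 120 years are all assumptions chosen for illustration; a real project would set its own limits.

```python
def find_suspect_ages(records, min_age=0, max_age=120):
    """Return the records whose age falls outside the plausible range.

    The default bounds are illustrative assumptions, not a standard.
    """
    return [r for r in records if not (min_age <= r["age"] <= max_age)]

# Hypothetical records; the second contains the Age = 200 error.
records = [
    {"id": 1, "age": 20},
    {"id": 2, "age": 200},
    {"id": 3, "age": 35},
]

suspect = find_suspect_ages(records)
suspect_ids = {r["id"] for r in suspect}

# Keep only records that passed the check for further analysis.
clean = [r for r in records if r["id"] not in suspect_ids]

print("Suspect records:", suspect)
print("Clean records:", clean)
```

Flagged records are kept in a separate list rather than silently dropped, so they can be reviewed and corrected at the source where possible.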