Distinctions in Data & Measurement (数据与测量的区分)
1. Data, Elements, Variables, Observations (数据、元素、变量、观测值的区分)
- 1.1 Data (数据)
- Definition / 定义: Fact-based information such as numbers, figures, tables. (基于事实的信息,例如数字、图形、表格。)
- Key Idea / 核心要点: Data = the whole collection (整张表或数据集).
- Example / 例子: A monthly sales table, in which one entry reads January sales = 2000 units. (月度销量表,其中一项为:一月份销量 = 2000件。)
- 1.2 Elements (元素)
- Definition / 定义: Objects/entities on which data are collected. (数据收集的对象或实体。)
- Example / 例子: A student, a product, a country. (学生、产品、国家。)
- 1.3 Variables (变量)
- Definition / 定义: Characteristics or attributes of elements. (元素的特征或属性。)
- Example / 例子: Age, gender, GPA. (年龄、性别、绩点。)
- 1.4 Observations (观测值)
- Definition / 定义: The set of values recorded for one element, one value per variable. (某一元素在各变量上的一组取值。)
- Example / 例子: Student A: gender = female, age = 20, GPA = 3.5. (学生A:性别=女,年龄=20,GPA=3.5。)
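The four terms above can be made concrete with a tiny dataset. The following is a minimal sketch in plain Python; Student A comes from the example above, while students B and C and all names such as `dataset` and `observation_a` are hypothetical and added only for illustration.

```python
# A tiny dataset: the whole collection is the "data".
# Student A matches the example above; B and C are made-up rows.
dataset = [
    {"student": "A", "gender": "female", "age": 20, "gpa": 3.5},
    {"student": "B", "gender": "male", "age": 22, "gpa": 3.1},
    {"student": "C", "gender": "female", "age": 21, "gpa": 3.8},
]

# Elements: the entities the data describe (one per row).
elements = [row["student"] for row in dataset]

# Variables: the characteristics recorded for every element (the columns).
variables = [key for key in dataset[0] if key != "student"]

# Observation: the full set of values recorded for one element.
observation_a = {name: dataset[0][name] for name in variables}

print(elements)       # ['A', 'B', 'C']
print(variables)      # ['gender', 'age', 'gpa']
print(observation_a)  # {'gender': 'female', 'age': 20, 'gpa': 3.5}
```

With 3 elements and 3 variables, the dataset holds 3 observations and 9 individual data values.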

2. Data Types (数据类型的区分)
- 2.1 Categorical Data (分类数据)
- Definition / 定义: Group-based or label-based (Nominal & Ordinal). (基于组别或标签的数据,包括名义和顺序。)
- Example / 例子: Gender (male/female), opinion (agree/disagree). (性别(男/女)、意见(同意/不同意)。)
- Visualization / 可视化: Bar chart, pie chart. (条形图、饼图。)
- 2.1.1 Nominal (名义尺度)
- Definition / 定义: Classification without order. (分类,无顺序。)
- Example / 例子: Gender, blood type. (性别、血型。)
- Key Use / 用途: Counting and grouping only. (用于计数和分组。)
- 2.1.2 Ordinal (顺序尺度)
- Definition / 定义: Ordered, but the intervals between values are not necessarily equal. (有顺序,但间隔不一定相等。)
- Example / 例子: Satisfaction rating 1–5. (满意度1–5。)
- Key Use / 用途: Ranking analysis. (排序分析。)
- 2.2 Quantitative Data (数量数据)
- Definition / 定义: Numeric with measurable meaning (Interval & Ratio). (有度量意义的数值,包括区间和比率。)
- Example / 例子: Age, distance, income. (年龄、距离、收入。)
- Visualization / 可视化: Histogram, line chart. (直方图、折线图。)
- 2.2.1 Interval (区间尺度)
- Definition / 定义: Ordered, equal intervals, no true zero. (有顺序,间隔相等,无绝对零点。)
- Example / 例子: Celsius temperature, calendar years. (摄氏温度、年份。)
- Key Use / 用途: Differences are meaningful; ratios are not (20 °C is 10 degrees warmer than 10 °C, but not "twice as hot"). (差值有意义,比例无意义:20°C 比 10°C 高 10 度,但并非“两倍热”。)
- 2.2.2 Ratio (比率尺度)
- Definition / 定义: Ordered, equal intervals, with true zero. (有顺序,间隔相等,有绝对零点。)
- Example / 例子: Income, weight, age, distance. (收入、体重、年龄、距离。)
- Key Use / 用途: All arithmetic including ratios. (可进行所有算术运算,包括比例。)
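The "Key Use" of each scale can be illustrated with one summary per scale. This is a rough sketch using only the Python standard library; every sample value below is invented for illustration, and the choice of summary per scale (mode, median, difference, ratio) is one common convention rather than the only valid one.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical sample values, one list per measurement scale.
blood_type = ["A", "O", "B", "O", "AB", "O"]   # nominal
satisfaction = [1, 3, 4, 4, 5]                 # ordinal (1-5 rating)
celsius = [10.0, 20.0, 30.0]                   # interval
income = [2000.0, 4000.0, 6000.0]              # ratio

# Nominal: counting and grouping only.
counts = Counter(blood_type)
most_common_type = counts.most_common(1)[0][0]

# Ordinal: rank-based summaries such as the median.
median_rating = median(satisfaction)

# Interval: differences (and means) are meaningful, ratios are not.
warming = celsius[-1] - celsius[0]
mean_temp = mean(celsius)

# Ratio: all arithmetic, including "how many times larger".
income_ratio = income[-1] / income[0]

print(most_common_type)  # O
print(median_rating)     # 4
print(warming)           # 20.0
print(mean_temp)         # 20.0
print(income_ratio)      # 3.0
```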

3. Quick Rules to Distinguish (快速区分法则)
- Step 1 → Ask: Categorical or Quantitative? (先判断分类还是数量)
- Step 2 → If Categorical → Nominal or Ordinal? (分类数据 → 名义或顺序)
- Step 3 → If Quantitative → Interval or Ratio? (数量数据 → 区间或比率)
- Nominal / 名义: Just labels. (只有标签)
- Ordinal / 顺序: Has order; gaps not necessarily equal. (有顺序,间隔不一定相等)
- Interval / 区间: Equal intervals, no true zero. (等间隔,无绝对零点)
- Ratio / 比率: Equal intervals + true zero. (等间隔+有零点)
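The three-step rule above can be written as a small decision function. This is only a sketch: the function name `classify_scale` and its three yes/no parameters are hypothetical, and answering those questions for a real variable still takes judgment.

```python
def classify_scale(is_quantitative: bool, has_order: bool, has_true_zero: bool) -> str:
    """Return the measurement scale implied by three yes/no answers."""
    # Step 1: categorical or quantitative?
    if not is_quantitative:
        # Step 2: categorical -> nominal or ordinal.
        return "ordinal" if has_order else "nominal"
    # Step 3: quantitative -> interval or ratio.
    return "ratio" if has_true_zero else "interval"


# Illustrative answers for four familiar variables.
print(classify_scale(False, False, False))  # blood type -> nominal
print(classify_scale(False, True, False))   # satisfaction rating -> ordinal
print(classify_scale(True, True, False))    # Celsius temperature -> interval
print(classify_scale(True, True, True))     # weight -> ratio
```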

4. Common Confusions (常见混淆点)
- Numbers as labels / 数字作标签: ZIP code, product ID = Nominal. (邮编、编号 → 名义尺度)
- Likert scale / 李克特量表: Satisfaction 1–5 = Ordinal, though in practice it is often treated as approximately Interval. (满意度1–5 = 顺序,实际中常近似当作区间)
- Temperature / 温度: Celsius/Fahrenheit = Interval; Kelvin = Ratio. (摄氏/华氏=区间;开尔文=比率)
- Age vs Year / 年龄与年份: Age = Ratio; Birth year = Interval. (年龄=比率;出生年份=区间)
- Income / 收入: Ratio (0 = no income). Net income can be negative (a loss), and a ratio between a positive and a negative value has no meaningful interpretation. (比率,0 = 无收入;净收入可为负,跨正负的倍数无意义)
- Ranks / 名次: Ordinal, not Interval: the gap between 1st and 2nd need not equal the gap between 2nd and 3rd. (名次=顺序,不是区间:第1名与第2名的差距不一定等于第2名与第3名的差距)
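The temperature case is the easiest confusion to check numerically. The sketch below uses two arbitrary example temperatures and a hypothetical helper `celsius_to_kelvin`. It shows that a Celsius ratio changes when the same temperatures are re-expressed in Kelvin, while the difference does not.

```python
def celsius_to_kelvin(c: float) -> float:
    """Convert Celsius to Kelvin, the scale with a true zero."""
    return c + 273.15


c_low, c_high = 10.0, 20.0  # arbitrary example temperatures

# Ratio on the interval scale: numerically 2.0, but not physically meaningful.
celsius_ratio = c_high / c_low

# Ratio on the ratio scale: the same two temperatures are only about 3.5% apart.
kelvin_ratio = celsius_to_kelvin(c_high) / celsius_to_kelvin(c_low)

# Differences agree on both scales (up to floating-point rounding).
celsius_diff = c_high - c_low
kelvin_diff = celsius_to_kelvin(c_high) - celsius_to_kelvin(c_low)

print(celsius_ratio)           # 2.0
print(round(kelvin_ratio, 3))  # 1.035
print(round(kelvin_diff, 6))   # 10.0
```

Only the Kelvin ratio supports a "times as hot" statement, which is why Celsius and Fahrenheit are classed as Interval and Kelvin as Ratio.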

5. Mini Cheat-Sheet (速查表)
- Data / 数据: Whole collection (整体信息)
- Elements / 元素: The entities each row describes (每行所描述的对象)
- Variables / 变量: Columns/features (列/特征)
- Observations / 观测值: One row’s full record (单行完整记录)
- Categorical / 分类: Nominal + Ordinal
- Quantitative / 数量: Interval + Ratio