2890. 重塑数据:融合
2890. 重塑数据:融合
题目
DataFrame report
+-------------+--------+ | Column Name | Type | +-------------+--------+ | product | object | | quarter_1 | int | | quarter_2 | int | | quarter_3 | int | | quarter_4 | int | +-------------+--------+
Write a solution to reshape the data so that each row represents sales data for a product in a specific quarter.
The result format is in the following example.
Example 1:
Input:
+-------------+-----------+-----------+-----------+-----------+ | product | quarter_1 | quarter_2 | quarter_3 | quarter_4 | +-------------+-----------+-----------+-----------+-----------+ | Umbrella | 417 | 224 | 379 | 611 | | SleepingBag | 800 | 936 | 93 | 875 | +-------------+-----------+-----------+-----------+-----------+
Output:
+-------------+-----------+-------+ | product | quarter | sales | +-------------+-----------+-------+ | Umbrella | quarter_1 | 417 | | SleepingBag | quarter_1 | 800 | | Umbrella | quarter_2 | 224 | | SleepingBag | quarter_2 | 936 | | Umbrella | quarter_3 | 379 | | SleepingBag | quarter_3 | 93 | | Umbrella | quarter_4 | 611 | | SleepingBag | quarter_4 | 875 | +-------------+-----------+-------+
Explanation:
The DataFrame is reshaped from wide to long format. Each row represents the sales of a product in a quarter.
题目大意
DataFrame report
+-------------+--------+ | Column Name | Type | +-------------+--------+ | product | object | | quarter_1 | int | | quarter_2 | int | | quarter_3 | int | | quarter_4 | int | +-------------+--------+
编写一个解决方案,将数据 重塑 成每一行表示特定季度产品销售数据的形式。
结果格式如下例所示:
示例 1:
输入:
+-------------+-----------+-----------+-----------+-----------+ | product | quarter_1 | quarter_2 | quarter_3 | quarter_4 | +-------------+-----------+-----------+-----------+-----------+ | Umbrella | 417 | 224 | 379 | 611 | | SleepingBag | 800 | 936 | 93 | 875 | +-------------+-----------+-----------+-----------+-----------+
输出:
+-------------+-----------+-------+ | product | quarter | sales | +-------------+-----------+-------+ | Umbrella | quarter_1 | 417 | | SleepingBag | quarter_1 | 800 | | Umbrella | quarter_2 | 224 | | SleepingBag | quarter_2 | 936 | | Umbrella | quarter_3 | 379 | | SleepingBag | quarter_3 | 93 | | Umbrella | quarter_4 | 611 | | SleepingBag | quarter_4 | 875 | +-------------+-----------+-------+
解释:
DataFrame 已从宽格式重塑为长格式。每一行表示一个季度内产品的销售情况。
解题思路
在 Pandas 中,可以使用 melt()
方法将宽表(Wide Format Table)转换为长表(Long Format Table)。这是数据整理中的常见操作,用于将列数据转为行数据。
- 指定保留列 (
id_vars
):- 使用
['product']
指定不需要转换的列,这些列会在结果表中保留为固定的列。
- 使用
- 设置新列名 (
var_name
):- 使用
var_name='quarter'
为转换后的列名(即原始宽表中的列名)命名为'quarter'
。
- 使用
- 设置新值列名 (
value_name
):- 使用
value_name='sales'
为原始宽表中列对应的值命名为'sales'
。
- 使用
- 输出结果:
- 转换后的表是一个长表,其中每行包含
product
、quarter
和对应的sales
值。
- 转换后的表是一个长表,其中每行包含
复杂度分析
- 时间复杂度:
O(n * m)
,其中n
是原始表的行数,m
是原始表的列数。melt()
需要遍历所有列并重构为行。 - 空间复杂度:
O(n * m)
,需要分配内存存储转换后的长表,每个原始值会占用一行。
代码
import pandas as pd
def meltTable(report: pd.DataFrame) -> pd.DataFrame:
return pd.melt(report, id_vars=['product'], var_name='quarter', value_name='sales')