跳至主要內容

2890. 重塑数据:融合


2890. 重塑数据:融合

🟢   🔗 力扣open in new window LeetCodeopen in new window

题目

DataFrame report

+-------------+--------+
| Column Name | Type   |
+-------------+--------+
| product     | object |
| quarter_1   | int    |
| quarter_2   | int    |
| quarter_3   | int    |
| quarter_4   | int    |
+-------------+--------+

Write a solution to reshape the data so that each row represents sales data for a product in a specific quarter.

The result format is in the following example.

Example 1:

Input:

+-------------+-----------+-----------+-----------+-----------+
| product     | quarter_1 | quarter_2 | quarter_3 | quarter_4 |
+-------------+-----------+-----------+-----------+-----------+
| Umbrella    | 417       | 224       | 379       | 611       |
| SleepingBag | 800       | 936       | 93        | 875       |
+-------------+-----------+-----------+-----------+-----------+

Output:

+-------------+-----------+-------+
| product     | quarter   | sales |
+-------------+-----------+-------+
| Umbrella    | quarter_1 | 417   |
| SleepingBag | quarter_1 | 800   |
| Umbrella    | quarter_2 | 224   |
| SleepingBag | quarter_2 | 936   |
| Umbrella    | quarter_3 | 379   |
| SleepingBag | quarter_3 | 93    |
| Umbrella    | quarter_4 | 611   |
| SleepingBag | quarter_4 | 875   |
+-------------+-----------+-------+

Explanation:

The DataFrame is reshaped from wide to long format. Each row represents the sales of a product in a quarter.

题目大意

DataFrame report

+-------------+--------+
| Column Name | Type   |
+-------------+--------+
| product     | object |
| quarter_1   | int    |
| quarter_2   | int    |
| quarter_3   | int    |
| quarter_4   | int    |
+-------------+--------+

编写一个解决方案,将数据 重塑 成每一行表示特定季度产品销售数据的形式。

结果格式如下例所示:

示例 1:

输入:

+-------------+-----------+-----------+-----------+-----------+
| product     | quarter_1 | quarter_2 | quarter_3 | quarter_4 |
+-------------+-----------+-----------+-----------+-----------+
| Umbrella    | 417       | 224       | 379       | 611       |
| SleepingBag | 800       | 936       | 93        | 875       |
+-------------+-----------+-----------+-----------+-----------+

输出:

+-------------+-----------+-------+
| product     | quarter   | sales |
+-------------+-----------+-------+
| Umbrella    | quarter_1 | 417   |
| SleepingBag | quarter_1 | 800   |
| Umbrella    | quarter_2 | 224   |
| SleepingBag | quarter_2 | 936   |
| Umbrella    | quarter_3 | 379   |
| SleepingBag | quarter_3 | 93    |
| Umbrella    | quarter_4 | 611   |
| SleepingBag | quarter_4 | 875   |
+-------------+-----------+-------+

解释:

DataFrame 已从宽格式重塑为长格式。每一行表示一个季度内产品的销售情况。

解题思路

在 Pandas 中,可以使用 melt() 方法将宽表(Wide Format Table)转换为长表(Long Format Table)。这是数据整理中的常见操作,用于将列数据转为行数据。

  1. 指定保留列 (id_vars)
    • 使用 ['product'] 指定不需要转换的列,这些列会在结果表中保留为固定的列。
  2. 设置新列名 (var_name)
    • 使用 var_name='quarter' 为转换后的列名(即原始宽表中的列名)命名为 'quarter'
  3. 设置新值列名 (value_name)
    • 使用 value_name='sales' 为原始宽表中列对应的值命名为 'sales'
  4. 输出结果
    • 转换后的表是一个长表,其中每行包含 productquarter 和对应的 sales 值。

复杂度分析

  • 时间复杂度O(n * m),其中 n 是原始表的行数,m 是原始表的列数。melt() 需要遍历所有列并重构为行。
  • 空间复杂度O(n * m),需要分配内存存储转换后的长表,每个原始值会占用一行。

代码

import pandas as pd

def meltTable(report: pd.DataFrame) -> pd.DataFrame:
  return pd.melt(report, id_vars=['product'], var_name='quarter', value_name='sales')