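6.6 Publication-Ready Figures

"Above all else show the data." - Edward Tufte, data visualization pioneer

From code to paper: producing professional figures that meet publication standards

Section Objectives

After completing this section, you will be able to:

Set figure dimensions and resolution that meet journal requirements
Configure professional fonts and labels
Build multi-panel layouts
Export high-quality images (PNG, PDF, SVG)
Understand the figure specifications of major journals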

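Figure Size and Resolution

Common Journal Requirements

Journal	Width	DPI	Formats
Nature	Single column: 89 mm, double column: 183 mm	300+	TIFF, EPS
Science	Single column: 9 cm, double column: 18 cm	300+	PDF, EPS
PNAS	Single column: 8.7 cm, double column: 17.8 cm	300+	TIFF, PDF
AER	Single column: 3.5 in, double column: 7 in	300+	EPS, PDF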
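Journal widths are quoted in millimeters, centimeters, or inches, while Matplotlib's `figsize` expects inches. The conversion can be sketched as follows. The helper name `figsize_from_width` and the default 4:3 aspect ratio are illustrative assumptions, not part of Matplotlib or of any journal's guidelines.

```python
MM_PER_INCH = 25.4


def figsize_from_width(width, unit="mm", aspect=4 / 3):
    """Return (width_in, height_in) for a column width given in mm, cm, or in.

    Hypothetical helper: the name and the 4:3 default are assumptions
    made for illustration only.
    """
    factors = {"mm": 1 / MM_PER_INCH, "cm": 10 / MM_PER_INCH, "in": 1.0}
    if unit not in factors:
        raise ValueError(f"unsupported unit: {unit!r}")
    width_in = width * factors[unit]
    # Height follows from the requested width-to-height ratio
    return (round(width_in, 2), round(width_in / aspect, 2))


print(figsize_from_width(89))          # Nature single column, in inches
print(figsize_from_width(17.8, "cm"))  # PNAS double column, in inches
```

The returned tuple can be passed straight to `plt.subplots(figsize=...)`.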

Setting the Figure Size

python

import matplotlib.pyplot as plt
import numpy as np

# Method 1: set the size when creating the figure
fig, ax = plt.subplots(figsize=(7, 5))  # inches
ax.plot([1, 2, 3], [1, 4, 9])
plt.tight_layout()

# Method 2: set global defaults through rcParams
plt.rcParams['figure.figsize'] = (7, 5)
plt.rcParams['figure.dpi'] = 300  # on-screen DPI
plt.rcParams['savefig.dpi'] = 600  # DPI used by savefig

# Common sizes (inches)
SIZES = {
    'nature_single': (3.5, 2.625),  # 89 mm ≈ 3.5 in, 4:3 aspect ratio
    'nature_double': (7.2, 5.4),    # 183 mm ≈ 7.2 in
    'science_single': (3.54, 2.66),
    'science_double': (7.08, 5.31),
    'pnas_single': (3.43, 2.57),
    'pnas_double': (7.01, 5.26)
}

# Use a preset size
fig, ax = plt.subplots(figsize=SIZES['nature_single'])
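For raster output, the pixel dimensions are the figure size in inches multiplied by the DPI. A small sketch that checks this by saving to memory and reading the PNG header; the helper name `png_pixel_size` is hypothetical, and the check assumes `bbox_inches='tight'` is not used, since cropping changes the final size.

```python
import io
import struct

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt


def png_pixel_size(fig, dpi):
    """Render fig to an in-memory PNG and read width/height from its IHDR chunk."""
    buf = io.BytesIO()
    fig.savefig(buf, format="png", dpi=dpi)  # no tight bbox, so nothing is cropped
    data = buf.getvalue()
    # PNG layout: 8-byte signature, 4-byte chunk length, 4-byte "IHDR",
    # then width and height as big-endian 32-bit integers
    width, height = struct.unpack(">II", data[16:24])
    return width, height


fig, ax = plt.subplots(figsize=(3.5, 2.5))
ax.plot([1, 2, 3], [1, 4, 9])
size = png_pixel_size(fig, dpi=300)  # 3.5 in x 300 and 2.5 in x 300
print(size)
plt.close(fig)
```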

Fonts and Styles

Setting Professional Fonts

python

# Method 1: use rcParams
plt.rcParams.update({
    'font.family': 'sans-serif',
    'font.sans-serif': ['Arial', 'Helvetica'],
    'font.size': 10,
    'axes.labelsize': 10,
    'axes.titlesize': 12,
    'xtick.labelsize': 9,
    'ytick.labelsize': 9,
    'legend.fontsize': 9,
    'figure.titlesize': 12
})

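# Method 2: render all text with LaTeX (use instead of Method 1, not together with it).
# Requires a working LaTeX installation; without one, drawing the figure raises an error.
plt.rcParams.update({
    'text.usetex': True,
    'font.family': 'serif',
    'font.serif': ['Times New Roman'],
    'font.size': 10
})

# Example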
fig, ax = plt.subplots(figsize=(7, 5))
ax.plot([1, 2, 3], [1, 4, 9], linewidth=2)
ax.set_xlabel(r'$x$ (unit)', fontsize=12)
ax.set_ylabel(r'$y = x^2$', fontsize=12)
ax.set_title('Professional Figure', fontsize=14, fontweight='bold')
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
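Arial and Helvetica are not installed on every system, and LaTeX rendering needs an external toolchain. One way to guard against both, sketched under the assumption that falling back to DejaVu Sans (the font bundled with Matplotlib) is acceptable; `first_available_font` is a hypothetical helper name.

```python
import shutil

import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
from matplotlib import font_manager


def first_available_font(preferred, fallback="DejaVu Sans"):
    """Return the first font in `preferred` that Matplotlib knows about, else `fallback`."""
    installed = {font.name for font in font_manager.fontManager.ttflist}
    for name in preferred:
        if name in installed:
            return name
    return fallback


chosen = first_available_font(["Arial", "Helvetica"])
plt.rcParams.update({"font.family": "sans-serif", "font.sans-serif": [chosen]})

# Only reports whether a `latex` executable is on the PATH; usetex may need
# further tools (for example dvipng), which this check does not cover.
latex_available = shutil.which("latex") is not None
print(chosen, latex_available)
```

Setting `'text.usetex': True` only when `latex_available` is true keeps the same script usable on machines without LaTeX.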

Removing Chart Clutter

python

import seaborn as sns

# Set a seaborn style suited to papers
sns.set_style("whitegrid")
sns.set_context("paper", font_scale=1.2)

# Or configure a minimal style by hand
plt.rcParams.update({
    'axes.spines.top': False,
    'axes.spines.right': False,
    'axes.grid': True,
    'axes.grid.axis': 'y',
    'grid.alpha': 0.3,
    'legend.frameon': False
})
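`plt.rcParams.update` changes the defaults for every later figure in the session. When the minimal style should apply to a single figure only, `plt.rc_context` scopes it to a `with` block. A minimal sketch; the constant name `MINIMAL_STYLE` is an assumption for illustration.

```python
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt

MINIMAL_STYLE = {
    "axes.spines.top": False,
    "axes.spines.right": False,
    "axes.grid": True,
    "axes.grid.axis": "y",
    "grid.alpha": 0.3,
    "legend.frameon": False,
}

before = plt.rcParams["axes.spines.top"]

# The settings apply only to figures created inside the with-block
with plt.rc_context(MINIMAL_STYLE):
    fig, ax = plt.subplots(figsize=(3.5, 2.625))
    ax.plot([1, 2, 3], [1, 4, 9], label="y = x^2")
    ax.legend()
    top_visible = ax.spines["top"].get_visible()

after = plt.rcParams["axes.spines.top"]  # restored on leaving the block
print(top_visible, before == after)
plt.close(fig)
```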

Multi-Panel Layouts

Method 1: GridSpec (recommended)

python

from matplotlib.gridspec import GridSpec

fig = plt.figure(figsize=(12, 8))
gs = GridSpec(3, 3, figure=fig, hspace=0.3, wspace=0.3)

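# Panels of different sizes
ax1 = fig.add_subplot(gs[0, :])   # row 1, full width
ax2 = fig.add_subplot(gs[1, :2])  # row 2, first two columns
ax3 = fig.add_subplot(gs[1, 2])   # row 2, third column
ax4 = fig.add_subplot(gs[2, 0])   # row 3, first column
ax5 = fig.add_subplot(gs[2, 1:])  # row 3, last two columns

# Draw the content here, then label each panel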
for ax, label in zip([ax1, ax2, ax3, ax4, ax5], ['A', 'B', 'C', 'D', 'E']):
    ax.text(0.05, 0.95, label, transform=ax.transAxes, fontsize=16,
           fontweight='bold', va='top')

plt.suptitle('Multi-Panel Figure', fontsize=16, fontweight='bold')
plt.show()

Method 2: plt.subplots with Panel Labels (A, B, C...)

python

fig, axes = plt.subplots(2, 2, figsize=(10, 8))
axes = axes.ravel()

for i, ax in enumerate(axes):
    # Draw the content
    ax.plot(np.random.randn(100).cumsum())

    # Add the panel label
    ax.text(-0.1, 1.05, chr(65+i),  # A, B, C, D
           transform=ax.transAxes,
           fontsize=16, fontweight='bold', va='top')

plt.tight_layout()
plt.show()
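The labeling loop above recurs in every multi-panel figure, so it can be factored into a small function. A sketch, assuming a hypothetical helper named `label_panels` with the same offsets as in the example:

```python
import string

import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import numpy as np


def label_panels(axes, x=-0.1, y=1.05, fontsize=16):
    """Write A, B, C... near the top-left corner of each Axes (axes coordinates)."""
    texts = []
    for letter, ax in zip(string.ascii_uppercase, np.ravel(axes)):
        texts.append(ax.text(x, y, letter, transform=ax.transAxes,
                             fontsize=fontsize, fontweight="bold", va="top"))
    return texts


fig, axes = plt.subplots(2, 2, figsize=(10, 8))
labels = [t.get_text() for t in label_panels(axes)]
print(labels)
plt.close(fig)
```

Because `np.ravel` flattens both a 2-D grid from `plt.subplots` and a plain list of Axes, the same call also works for the GridSpec layout in Method 1.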

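Exporting High-Quality Images

Basic Export

python

# Call savefig before plt.show(); once the window is closed the current figure is empty

# PNG (raster; suited to on-screen viewing)
plt.savefig('figure1.png', dpi=300, bbox_inches='tight')

# PDF (vector; suited to manuscript submission)
plt.savefig('figure1.pdf', bbox_inches='tight')

# SVG (vector; suited to later editing)
plt.savefig('figure1.svg', bbox_inches='tight')

# EPS (required by some journals)
plt.savefig('figure1.eps', bbox_inches='tight')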
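When a journal accepts several formats, the same figure can be written once per format in a loop. A minimal sketch, assuming a hypothetical helper `save_all`; it writes into a temporary directory here only so that the example cleans up after itself.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt


def save_all(fig, stem, formats=("png", "pdf", "svg"), dpi=300):
    """Save `fig` once per format and return the written paths."""
    paths = []
    for fmt in formats:
        path = Path(f"{stem}.{fmt}")  # format is inferred from the extension
        fig.savefig(path, dpi=dpi, bbox_inches="tight")
        paths.append(path)
    return paths


fig, ax = plt.subplots(figsize=(3.5, 2.625))
ax.plot([1, 2, 3], [1, 4, 9])

with tempfile.TemporaryDirectory() as tmp:
    written = save_all(fig, Path(tmp) / "figure1")
    sizes = {p.suffix: p.stat().st_size for p in written}
plt.close(fig)
print(sorted(sizes))
```

Calling `fig.savefig` on the figure object, rather than `plt.savefig`, avoids depending on which figure happens to be current.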

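Advanced Export Options

python

# Full set of parameters
plt.savefig(
    'figure1.pdf',
    dpi=600,                   # affects only rasterized elements in a vector PDF
    bbox_inches='tight',       # crop surrounding whitespace automatically
    pad_inches=0.1,            # padding kept around the figure, in inches
    transparent=True,          # transparent axes background
    facecolor='white',         # figure background; set explicitly, it stays white despite transparent=True
    edgecolor='none',          # no figure border
    format='pdf',              # output format
    metadata={                 # PDF metadata
        'Creator': 'Python matplotlib',
        'Author': 'Your Name',
        'Title': 'Figure 1'
    }
)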
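One export setting not covered above is how fonts are embedded in vector files. By default Matplotlib writes Type 3 fonts to PDF and PS, which can be harder to edit in vector-graphics software; the `pdf.fonttype`, `ps.fonttype`, and `svg.fonttype` rcParams change this. A sketch of the settings; whether a given journal requires them is an assumption to check against its guidelines.

```python
import io

import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt

# 42 embeds fonts as TrueType (Type 42) instead of the default Type 3
plt.rcParams["pdf.fonttype"] = 42
plt.rcParams["ps.fonttype"] = 42
# 'none' keeps SVG text as text instead of converting it to paths
plt.rcParams["svg.fonttype"] = "none"

fig, ax = plt.subplots(figsize=(3.5, 2.625))
ax.plot([1, 2, 3], [1, 4, 9])
ax.set_xlabel("x (unit)")

# Save to memory to confirm the export still succeeds with these settings
buf = io.BytesIO()
fig.savefig(buf, format="pdf", bbox_inches="tight")
pdf_bytes = buf.getvalue()
plt.close(fig)
print(pdf_bytes[:4])
```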

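Complete Example: A Publication-Quality Figure

python

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
from matplotlib.gridspec import GridSpec  # needed for the panel layout below

# Global settings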
plt.rcParams.update({
    'font.family': 'sans-serif',
    'font.sans-serif': ['Arial'],
    'font.size': 10,
    'axes.labelsize': 11,
    'axes.titlesize': 12,
    'xtick.labelsize': 10,
    'ytick.labelsize': 10,
    'legend.fontsize': 9,
    'axes.spines.top': False,
    'axes.spines.right': False,
    'axes.grid': True,
    'axes.grid.axis': 'y',
    'grid.alpha': 0.3,
    'legend.frameon': False
})

# Generate synthetic data
np.random.seed(42)
n = 500
education = np.random.normal(13, 3, n)
experience = np.random.uniform(0, 30, n)
female = np.random.binomial(1, 0.5, n)
log_wage = (1.5 + 0.08*education + 0.03*experience - 0.0005*experience**2
            - 0.15*female + np.random.normal(0, 0.3, n))
wage = np.exp(log_wage)

df = pd.DataFrame({
    'wage': wage,
    'log_wage': log_wage,
    'education': education,
    'experience': experience,
    'female': female,
    'gender': ['Female' if f else 'Male' for f in female]
})

# Create the publication-quality multi-panel figure
fig = plt.figure(figsize=(12, 10))
gs = GridSpec(3, 2, figure=fig, hspace=0.35, wspace=0.3)

# Panel A: Distribution
ax1 = fig.add_subplot(gs[0, 0])
ax1.hist(df['wage'], bins=30, edgecolor='black', alpha=0.7, color='steelblue')
ax1.set_xlabel('Wage (thousand yuan/month)')
ax1.set_ylabel('Frequency')
ax1.set_title('Distribution of Wages')
ax1.text(-0.15, 1.05, 'A', transform=ax1.transAxes, fontsize=14, fontweight='bold')

# Panel B: Scatter plot by gender
ax2 = fig.add_subplot(gs[0, 1])
for gender, color, marker in [('Male', 'blue', 'o'), ('Female', 'red', 's')]:
    mask = df['gender'] == gender
    ax2.scatter(df.loc[mask, 'education'], df.loc[mask, 'log_wage'],
               alpha=0.4, s=30, color=color, marker=marker, label=gender)
ax2.set_xlabel('Education (years)')
ax2.set_ylabel('log(Wage)')
ax2.set_title('Education-Wage Relationship by Gender')
ax2.legend(loc='upper left')
ax2.text(-0.15, 1.05, 'B', transform=ax2.transAxes, fontsize=14, fontweight='bold')

# Panel C: Box plot
ax3 = fig.add_subplot(gs[1, 0])
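# Fix the category order so the colors match Panel B (Male blue, Female red)
sns.boxplot(x='gender', y='wage', data=df, ax=ax3,
            order=['Male', 'Female'], palette=['blue', 'red'])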
ax3.set_xlabel('Gender')
ax3.set_ylabel('Wage (thousand yuan/month)')
ax3.set_title('Wage Distribution by Gender')
ax3.text(-0.15, 1.05, 'C', transform=ax3.transAxes, fontsize=14, fontweight='bold')

# Panel D: Regression diagnostics
ax4 = fig.add_subplot(gs[1, 1])
X = sm.add_constant(df[['education', 'experience', 'female']])
model = sm.OLS(df['log_wage'], X).fit()
ax4.scatter(model.fittedvalues, model.resid, alpha=0.5, s=30)
ax4.axhline(y=0, color='r', linestyle='--', linewidth=2)
ax4.set_xlabel('Fitted values')
ax4.set_ylabel('Residuals')
ax4.set_title('Residual Plot')
ax4.text(-0.15, 1.05, 'D', transform=ax4.transAxes, fontsize=14, fontweight='bold')

# Panel E: Coefficient plot
ax5 = fig.add_subplot(gs[2, :])
coefs = model.params[1:]  # drop the constant term
ci = model.conf_int()[1:]
y_pos = np.arange(len(coefs))
ax5.errorbar(coefs, y_pos, xerr=[coefs - ci[0], ci[1] - coefs],
            fmt='o', markersize=6, capsize=4, capthick=1.5, linewidth=1.5)
ax5.axvline(x=0, color='red', linestyle='--', linewidth=1.5, alpha=0.7)
ax5.set_yticks(y_pos)
ax5.set_yticklabels(['Education', 'Experience', 'Female'])
ax5.set_xlabel('Coefficient estimate')
ax5.set_title('Regression Coefficients with 95% CI')
ax5.text(-0.08, 1.05, 'E', transform=ax5.transAxes, fontsize=14, fontweight='bold')

# Add the figure caption
fig.text(0.5, 0.01, 'Figure 1. Wage Determination Analysis. '
         'Panel A shows the distribution of wages. Panel B displays education-wage relationships by gender. '
         'Panel C compares wage distributions. Panel D shows regression residuals. '
         'Panel E presents coefficient estimates. N=500.',
         ha='center', fontsize=9, wrap=True)

plt.savefig('figure1_publication.pdf', dpi=600, bbox_inches='tight')
plt.savefig('figure1_publication.png', dpi=300, bbox_inches='tight')
plt.show()

print("Figures saved: figure1_publication.pdf and figure1_publication.png")

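Section Summary

Publication Figure Checklist

File requirements:

[ ] Resolution ≥ 300 DPI for raster output
[ ] Vector format used (PDF/EPS/SVG)
[ ] Reasonable file size (< 10 MB)

Visual requirements:

[ ] Fonts clear and legible (≥ 9 pt)
[ ] Moderate line widths (≥ 1 pt)
[ ] Colorblind-friendly colors
[ ] Unnecessary elements removed

Content requirements:

[ ] Title concise and clear
[ ] Axes labeled with units
[ ] Legend complete
[ ] Panels labeled (A, B, C...)
[ ] Detailed figure caption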
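Two items on the checklist, minimum font size and minimum line width, can be checked programmatically. A rough sketch, assuming a hypothetical helper `check_figure` that inspects only non-empty text and data lines; it does not examine ticks, grid lines, or markers.

```python
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
from matplotlib.text import Text


def check_figure(fig, min_font_pt=9, min_line_pt=1):
    """Return messages for text or data lines that fall below the thresholds."""
    issues = []
    for text in fig.findobj(Text):
        # Skip empty placeholders such as unset titles
        if text.get_text().strip() and text.get_fontsize() < min_font_pt:
            issues.append(f"text {text.get_text()!r}: {text.get_fontsize():g} pt")
    for ax in fig.axes:
        for line in ax.get_lines():  # data lines only
            if line.get_linewidth() < min_line_pt:
                issues.append(f"line {line.get_label()!r}: {line.get_linewidth():g} pt")
    return issues


fig, ax = plt.subplots(figsize=(3.5, 2.625))
ax.plot([1, 2, 3], [1, 4, 9], linewidth=0.5, label="thin")
ax.set_xlabel("x (unit)", fontsize=6)   # below the 9 pt threshold
ax.set_ylabel("y (unit)", fontsize=10)  # acceptable
problems = check_figure(fig)
plt.close(fig)
print(problems)
```

The default thresholds mirror the checklist values; a journal with different limits would pass its own numbers.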

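Common Mistakes

Avoid:

Image resolution too low
Using RGB colors where print requires CMYK (check the journal's required color mode)
Fonts too small (< 8 pt)
Overly complex figures (one figure carrying several messages)
Color figures that cannot be told apart when printed in black and white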
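The last point can be screened for before printing by estimating how light or dark each color appears in grayscale. A minimal sketch using the Rec. 601 luma weights; the function names and the `min_gap=0.1` threshold are illustrative assumptions, not a standard.

```python
def relative_gray(hex_color):
    """Approximate gray level (0 = black, 1 = white) using Rec. 601 luma weights."""
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4))
    return 0.299 * r + 0.587 * g + 0.114 * b


def too_similar_in_gray(colors, min_gap=0.1):
    """Return neighboring pairs (by gray level) that differ by less than min_gap.

    min_gap=0.1 is an arbitrary illustrative threshold.
    """
    levels = sorted((relative_gray(c), c) for c in colors)
    return [(a[1], b[1]) for a, b in zip(levels, levels[1:]) if b[0] - a[0] < min_gap]


# Pure red, blue, and green are well separated in gray level
print(too_similar_in_gray(["#FF0000", "#0000FF", "#00FF00"]))
# Pure red and a dark gray are almost identical once color is removed
print(too_similar_in_gray(["#FF0000", "#4C4C4C"]))
```

When two series do collide, varying the marker or line style in addition to the color keeps them distinguishable.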

Further Resources

Visualization Guides

Nature Figure Guidelines: https://www.nature.com/nature/for-authors/final-submission
Ten Simple Rules for Better Figures (Rougier et al., 2014)
Matplotlib Cheatsheets: https://github.com/matplotlib/cheatsheets

Python Tools

bash

# Recommended visualization packages
pip install matplotlib seaborn plotly
pip install adjustText  # automatic label placement
pip install scienceplots  # scientific-style themes

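Congratulations! Module 6 Complete!

You have now learned:

Univariate and bivariate visualization
Regression visualization
Methods for comparing distributions
Producing publication-quality figures

Next step: apply these skills to your own research and let your data tell a compelling story!

Keep exploring the other StatsPai modules!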
