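PyTorch/TensorFlow Deep Learning Quick Start
Simple applications of neural networks
What is deep learning?
Deep learning = multi-layer neural networks
Use cases in the social sciences:
- Text classification (sentiment analysis, topic classification)
- Image analysis (social media images)
- Sequence prediction (time series)
Note: in social science research, traditional machine learning (sklearn) is usually sufficient!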
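To make "multi-layer neural network" concrete, here is a minimal sketch of the forward pass of a two-layer network in plain Python. The weights, biases, and layer sizes are hypothetical values picked for illustration, and the helper names (`linear`, `relu`, `forward`) are not from any library.

```python
def linear(x, weights, bias):
    """One fully connected layer: weights @ x + bias for a single input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(values):
    """Element-wise ReLU: negative values become 0."""
    return [max(0.0, v) for v in values]

def forward(x, w1, b1, w2, b2):
    """Two stacked layers with a non-linearity in between."""
    hidden = relu(linear(x, w1, b1))
    return linear(hidden, w2, b2)

# Hypothetical parameters: 2 inputs -> 3 hidden units -> 1 output
w1 = [[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]]
b1 = [0.0, 0.0, 0.0]
w2 = [[1.0, 2.0, -1.0]]
b2 = [0.5]

output = forward([2.0, 1.0], w1, b1, w2, b2)
print(output)  # [4.5]
```

Frameworks such as PyTorch and Keras perform the same kind of computation on whole batches at once and learn the weights from data instead of having them written by hand.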
PyTorch Basics
Installation
bash
pip install torch

A simple neural network
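The PyTorch example in this section trains by repeating three steps: a forward pass, a gradient computation, and a parameter update. As a rough sketch of what those steps do, the loop below fits a single weight and bias in plain Python. It is an illustration only: it uses plain gradient descent rather than Adam, derives the gradients by hand instead of calling `loss.backward()`, and the toy data and learning rate are made up.

```python
# Toy data generated from y = 2x + 1 (made up for illustration)
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = 0.0, 0.0  # parameters to learn
lr = 0.05        # learning rate (assumed value)

def mse(w, b):
    """Mean squared error of the line w*x + b on the toy data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

initial_loss = mse(w, b)

for epoch in range(2000):
    # Forward pass: prediction errors for every sample
    errors = [w * x + b - y for x, y in zip(xs, ys)]
    # "Backward pass": gradients of the MSE with respect to w and b
    grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / len(xs)
    grad_b = 2 * sum(errors) / len(xs)
    # Update step: move each parameter against its gradient
    w -= lr * grad_w
    b -= lr * grad_b

final_loss = mse(w, b)
print(round(w, 3), round(b, 3))
```

In PyTorch, `loss.backward()` computes the gradients automatically and `optimizer.step()` applies the update, so the loop keeps the same shape no matter how many parameters the model has.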
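python
import torch
import torch.nn as nn
import torch.optim as optim

# Define the model: input -> hidden layer -> ReLU -> output
class SimpleNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x

# Placeholder data so the example runs (assumption: replace with your own tensors)
X_train = torch.randn(100, 10)
y_train = torch.randn(100, 1)
X_test = torch.randn(20, 10)

# Create the model
model = SimpleNN(input_size=10, hidden_size=20, output_size=1)

# Loss function and optimizer
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop (simplified: full batch, no validation)
model.train()
for epoch in range(100):
    # Forward pass
    outputs = model(X_train)
    loss = criterion(outputs, y_train)

    # Backward pass
    optimizer.zero_grad()  # clear gradients from the previous step
    loss.backward()        # compute new gradients
    optimizer.step()       # update the parameters

# Prediction
model.eval()
with torch.no_grad():  # gradients are not needed for inference
    predictions = model(X_test)

TensorFlow/Keras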
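The Keras model in this section uses the same 10 -> 20 -> 1 layout as the PyTorch model. As a quick sanity check on model size, the sketch below counts the trainable parameters of such a stack of fully connected layers, assuming every layer has a bias term (the default for both `nn.Linear` and `keras.layers.Dense`). The helper name `dense_params` is made up for this example.

```python
def dense_params(n_in, n_out):
    """Parameters of one fully connected layer: a weight per input-output pair plus one bias per output."""
    return n_in * n_out + n_out

layer_sizes = [10, 20, 1]  # input size, hidden size, output size

# Pair each layer size with the next one: (10, 20) and (20, 1)
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

print(per_layer, total)  # [220, 21] 241
```

With only 241 parameters, this network is small enough to train on a laptop CPU; the count should match what `model.summary()` reports for the Keras version.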
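A more concise API
python
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Placeholder data so the example runs (assumption: replace with your own arrays)
X_train = np.random.rand(100, 10).astype("float32")
y_train = np.random.rand(100, 1).astype("float32")
X_test = np.random.rand(20, 10).astype("float32")

# Define the model
model = keras.Sequential([
    keras.Input(shape=(10,)),
    keras.layers.Dense(20, activation='relu'),
    keras.layers.Dense(1)
])

# Compile: choose the optimizer, loss, and metrics to track
model.compile(
    optimizer='adam',
    loss='mse',
    metrics=['mae']
)

# Train (the last 20% of the training data is held out for validation)
history = model.fit(
    X_train, y_train,
    epochs=100,
    batch_size=32,
    validation_split=0.2,
    verbose=0
)

# Prediction
predictions = model.predict(X_test)

Hands-on: text sentiment classification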
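The example in this section first turns each text into a fixed-length sequence of integer ids before it reaches the model. The sketch below mimics that idea in plain Python so the transformation is visible. It is a simplified illustration, not the Keras implementation: the helper names `build_vocab` and `encode` are made up, and the tokenization rule (lowercase, letters and apostrophes only) is an assumption.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and keep runs of letters or apostrophes as words."""
    return re.findall(r"[a-z']+", text.lower())

def build_vocab(texts, num_words):
    """Map the most frequent words to ids 1, 2, ...; id 0 is reserved for padding."""
    counts = Counter(word for text in texts for word in tokenize(text))
    most_common = counts.most_common(num_words - 1)
    return {word: i + 1 for i, (word, _) in enumerate(most_common)}

def encode(text, vocab, maxlen):
    """Convert a text to ids, keep the last maxlen tokens, and left-pad with zeros."""
    ids = [vocab[w] for w in tokenize(text) if w in vocab]
    ids = ids[-maxlen:]
    return [0] * (maxlen - len(ids)) + ids

texts = [
    "This product is amazing!",
    "Terrible experience, would not recommend.",
    "Pretty good overall.",
]
vocab = build_vocab(texts, num_words=10000)
X = [encode(t, vocab, maxlen=5) for t in texts]
print(X[2])  # [0, 0, 10, 11, 12]
```

Every row of `X` has the same length, which is what lets the rows be stacked into one array and fed to an embedding layer.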
python
import numpy as np
from tensorflow import keras
# Note: Tokenizer is a legacy utility; TensorFlow's docs recommend
# keras.layers.TextVectorization for new code.
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Text data
texts = [
    "This product is amazing!",
    "Terrible experience, would not recommend.",
    "Pretty good overall.",
    # ...
]
labels = [1, 0, 1, ...]  # 1 = positive, 0 = negative; one label per text

# Vectorize the text: map words to integer ids, then pad to a fixed length
tokenizer = Tokenizer(num_words=10000)
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)
X = pad_sequences(sequences, maxlen=100)
y = np.array(labels)  # Keras expects an array, not a Python list

# Model: embedding -> average over the sequence -> small classifier
model = keras.Sequential([
    keras.layers.Embedding(10000, 16),
    keras.layers.GlobalAveragePooling1D(),
    keras.layers.Dense(16, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid')  # probability of the positive class
])

model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy']
)

# Train
model.fit(X, y, epochs=10, batch_size=32)

When should you use deep learning?
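Good fit for deep learning
- Large datasets (>100,000 samples)
- Images, text, audio
- Complex non-linear relationships
Poor fit for deep learning
- Small samples (<1,000)
- Tabular data (sklearn works better)
- Interpretability is required
Advice for social scientists: reach for sklearn first, unless you have a specific need!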
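The rules of thumb above can be sketched as a small helper function. This is only an illustration of the checklist, not a definitive decision procedure: the function name `suggest_approach`, the category labels, and the handling of the range between 1,000 and 100,000 samples are assumptions, since the lists above do not cover that range.

```python
def suggest_approach(n_samples, data_type, needs_interpretability=False):
    """Return a rough recommendation.

    data_type is one of "tabular", "text", "image", "audio" (labels assumed for this sketch).
    """
    # Any "poor fit" criterion points to traditional machine learning
    if needs_interpretability or data_type == "tabular" or n_samples < 1000:
        return "sklearn"
    # Both "good fit" criteria met: large data and unstructured input
    if n_samples > 100_000 and data_type in {"text", "image", "audio"}:
        return "deep learning"
    # Middle ground (assumption): start simple and compare
    return "start with sklearn, then compare"

print(suggest_approach(500, "text"))       # sklearn
print(suggest_approach(200_000, "image"))  # deep learning
print(suggest_approach(50_000, "text"))    # start with sklearn, then compare
```

The function ignores the "complex non-linear relationships" criterion, which is hard to judge before fitting a model; comparing against a sklearn baseline is one way to check it in practice.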
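Exercises
python
# (Optional) Try using PyTorch or TensorFlow to:
# 1. Build a simple regression model
# 2. Train and evaluate it
# 3. Compare the results with sklearn

Next steps
Next section: Quick start with LLM APIs (the most practical one!)
Keep going!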