Module 3: Python Basic Syntax
Learning Programming from Scratch — Variables, Operations, Conditionals, and Loops
Module Overview
The essence of programming is expressing logic through code. This module will help you master Python's core syntax, from the most basic variables and data types to conditional statements and loop control. These are the building blocks of all programs and essential foundations for subsequent data processing and statistical analysis.
Learning Objectives
After completing this module, you will be able to:
- Understand the essence of variables and proficiently use Python data types
- Master arithmetic, comparison, and logical operators
- Use conditional statements (if-elif-else) to implement decision logic
- Use loops (for, while) to handle repetitive tasks
- Understand syntax differences between Python and Stata/R
- Write simple data processing and analysis programs
Module Contents
01 - Variables and Data Types
Core Question: How do you store and manipulate data in Python?
Core Content:
- Variable creation and naming conventions
- Five basic data types:
  - Integers (int): age, population, year
  - Floats (float): income, GDP, interest rate
  - Strings (str): name, region, text
  - Booleans (bool): employment status, treatment group indicator
  - None: representing missing values
- Type conversion (int(), float(), str())
- Type checking (type(), isinstance())
- Comparison with Stata/R variable types
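The conversion and checking functions listed above can be sketched as follows. This is a minimal illustration, not course code: the names raw_age, raw_income, and spouse_income and their values are assumptions made up for the example.

```python
# Illustrative values only; survey data often arrives as text.
raw_age = "42"            # str
raw_income = "75000.50"   # str

age = int(raw_age)              # "42" -> 42
income = float(raw_income)      # "75000.50" -> 75000.5
label = str(age) + " years"     # 42 -> "42 years"

# int() truncates toward zero; it does not round.
truncated = int(income)         # 75000

# type() reports the exact type; isinstance() answers a yes/no question.
print(type(age))                        # <class 'int'>
print(isinstance(income, float))        # True
print(isinstance(age, (int, float)))    # True: a tuple accepts either type

# bool is a subclass of int, so True behaves like 1 in arithmetic.
print(isinstance(True, int))            # True
print(True + True)                      # 2

# None marks "no value"; test for it with `is`, not `==`.
spouse_income = None
print(spouse_income is None)            # True
```

Because True counts as 1, summing a list of booleans gives the number of True values, which is handy for counting employed respondents or treated units.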
Practical Application:
# Research scenario: education returns
years_edu = 16 # years of education (integer)
income = 75000.50 # annual income (float)
name = "Alice" # name (string)
is_employed = True # employment status (boolean)
02 - Operators
Core Question: How do you perform mathematical calculations and logical judgments?
Core Content:
- Arithmetic operators: +, -, *, /, //, %, **
- Comparison operators: ==, !=, >, <, >=, <=
- Logical operators: and, or, not
- Assignment operators: =, +=, -=, *=, /=
- Membership operators: in, not in
- Operator precedence
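A short sketch of the less familiar operators and of precedence may help here. The figures and variable names below are hypothetical and chosen only to keep the arithmetic easy to check.

```python
household_income = 75000   # hypothetical figures
household_size = 4

per_capita = household_income / household_size   # true division: 18750.0 (always a float)
full_thousands = household_income // 1000        # floor division: 75
remainder = household_income % 1000              # modulo: 0
squared = household_size ** 2                    # exponentiation: 16

# Precedence: ** binds tighter than unary minus, and * is applied before +.
a = -2 ** 2      # -(2 ** 2) = -4
b = (-2) ** 2    # 4
c = 2 + 3 * 4    # 14

# `not` binds tighter than `and`, which binds tighter than `or`.
d = True or False and False    # True or (False and False) -> True

# Membership tests work on lists and on strings.
region = "West"
in_sample = region in ["West", "South"]   # True
has_est = "est" in region                 # True (substring test)

# Augmented assignment updates a variable in place.
total = 100
total += 50    # 150
total *= 2     # 300
```

When precedence is in doubt, adding parentheses costs nothing and makes the intent explicit.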
Practical Application:
# Calculate education returns
income_college = 75000
income_high_school = 50000
return_to_edu = (income_college - income_high_school) / income_high_school * 100
print(f"Return to education: {return_to_edu:.1f}%") # 50.0%
# Identify high earners (is_employed comes from the section 01 example)
is_high_earner = income_college > 70000 and is_employed
03 - Conditional Statements
Core Question: How do you make programs execute different actions based on conditions?
Core Content:
- if statement: single condition
- if-else statement: binary choice logic
- if-elif-else statement: multiple conditional branches
- Nested conditional statements
- Conditional expressions (ternary operator)
- Comparison with Stata (if/else) and R (if/ifelse)
Practical Application:
# Income grouping
if income < 30000:
    income_group = "Low income"
elif income < 80000:
    income_group = "Middle income"
else:
    income_group = "High income"
# Policy intervention eligibility
eligible = "Qualified" if income < 50000 and is_employed else "Not qualified"
Research Scenarios:
- Group statistics (low/middle/high income)
- Policy intervention eligibility screening
- Outlier flagging
- Dummy variable generation
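The scenarios above can be sketched on a small list of observations. The sample values, the 50,000 threshold for the dummy, and the OUTLIER_CUTOFF constant are all assumptions chosen for illustration; the list comprehension syntax used here is introduced in section 04.

```python
incomes = [25000, 48000, 1200000, 62000, 91000]   # hypothetical sample

def income_group(income):
    """Return the income bracket, using the same cutoffs as the example above."""
    if income < 30000:
        return "Low income"
    elif income < 80000:
        return "Middle income"
    else:
        return "High income"

# Group statistics: assign each observation to a bracket.
groups = [income_group(x) for x in incomes]

# Dummy variable: 1 if income is below 50,000, else 0.
low_income_dummy = [1 if x < 50000 else 0 for x in incomes]

# Outlier flag: the 1,000,000 cutoff is an arbitrary illustration.
OUTLIER_CUTOFF = 1_000_000
is_outlier = [x > OUTLIER_CUTOFF for x in incomes]

print(groups)
print(low_income_dummy)   # [1, 1, 0, 0, 0]
print(is_outlier)         # [False, False, True, False, False]
```

Wrapping the if-elif-else chain in a function keeps the cutoffs in one place, so changing a bracket boundary means editing a single line.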
04 - Loops
Core Question: How do you efficiently handle repetitive tasks?
Core Content:
- for loop: iterate through sequences (lists, ranges, strings)
- while loop: condition-based repetition
- Loop control: break (exit), continue (skip), else (normal completion)
- Nested loops: processing multidimensional data
- List comprehensions: concise loop syntax
- Comparison with Stata (forvalues/foreach) and R (for/apply)
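The loop control keywords are easiest to tell apart side by side. The following is a minimal sketch with made-up records and an assumed 5% growth rate, not an excerpt from the course chapters.

```python
incomes = [52000, 61000, -100, 75000, None, 88000]   # hypothetical records

# continue skips one item; break leaves the loop entirely.
valid = []
for value in incomes:
    if value is None:
        break          # stop at the first missing record
    if value < 0:
        continue       # skip impossible values, keep looping
    valid.append(value)
print(valid)           # [52000, 61000, 75000]

# The else clause runs only if the loop was never interrupted by break.
for value in valid:
    if value > 100000:
        found_high = True
        break
else:
    found_high = False   # the loop finished without hitting break

# while repeats until its condition becomes false.
balance = 1000.0
years = 0
while balance < 2000:
    balance *= 1.05      # 5% growth per year, an assumed rate
    years += 1
print(years)             # 15
```

Note that 88000 never reaches valid: break ends the loop at the None record, whereas continue only skips the negative value.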
Practical Application:
# Calculate income for multiple education levels
edu_levels = [12, 14, 16, 18, 20]
for years in edu_levels:
    income = 30000 + 5000 * years
    print(f"{years} years of education → Income: ${income:,}")
# Data cleaning: keep only positive incomes (drops negatives and zeros)
incomes = [50000, -100, 75000, -500, 60000]
clean_incomes = [x for x in incomes if x > 0]
Research Scenarios:
- Batch data processing
- Monte Carlo simulation
- Robustness checks (multiple models)
- Bootstrap sampling
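As one illustration of these scenarios, bootstrap sampling can be written with nothing more than a for loop and the standard library. This is a rough sketch under stated assumptions: the sample, the seed, and the number of replications are invented, and the interval uses a simple percentile rule rather than any refined method.

```python
import random
import statistics

random.seed(42)   # fixed seed so repeated runs give the same draws

incomes = [32000, 45000, 51000, 58000, 64000, 72000, 85000, 120000]   # hypothetical sample
n_reps = 1000     # number of bootstrap replications (an assumption)

boot_means = []
for _ in range(n_reps):
    # Draw a resample of the same size, with replacement.
    resample = random.choices(incomes, k=len(incomes))
    boot_means.append(statistics.mean(resample))

boot_means.sort()
# Percentile interval: roughly the 2.5th and 97.5th percentiles of the bootstrap means.
lower = boot_means[n_reps * 25 // 1000]
upper = boot_means[n_reps * 975 // 1000 - 1]

print(f"Sample mean: {statistics.mean(incomes):,.0f}")
print(f"Approximate 95% interval: {lower:,.0f} to {upper:,.0f}")
```

The same loop skeleton (repeat, compute a statistic, collect the result) also fits Monte Carlo simulation and robustness checks; only the body of the loop changes.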
05 - Summary and Review
Content:
- Core concept review
- Comprehensive practice exercises
- Stata/R/Python syntax comparison table
- Common errors and debugging tips
- Next steps learning suggestions
Python vs Stata vs R Syntax Comparison
Variable Creation
| Operation | Python | Stata | R |
|---|---|---|---|
| Create variable | age = 25 | gen age = 25 | age <- 25 |
| Modify variable | age = 30 | replace age = 30 | age <- 30 |
| Check type | type(age) | describe age | class(age) |
Conditional Statements
| Operation | Python | Stata | R |
|---|---|---|---|
| Single condition | if x > 0: | if x > 0 { | if (x > 0) { |
| Multiple conditions | if/elif/else | if/else if/else | if/else if/else |
| Conditional assignment | y = 1 if x>0 else 0 | gen y = (x>0) | y <- ifelse(x>0, 1, 0) |
Loops
| Operation | Python | Stata | R |
|---|---|---|---|
| Numeric loop (1 to 10) | for i in range(1, 11): | forvalues i = 1/10 { | for (i in 1:10) { |
| List loop | for x in list: | foreach x in list { | for (x in list) { |
| Conditional loop | while x < 10: | while x < 10 { | while (x < 10) { |
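One difference worth checking by hand when translating numeric loops: Python's range() starts at 0 by default and excludes its upper bound, whereas Stata's 1/10 and R's 1:10 include both endpoints. A small sketch, with an illustrative regions list:

```python
# range(10) starts at 0 and stops before 10.
zero_based = list(range(10))        # [0, 1, ..., 9]

# range(1, 11) covers the same values as Stata's 1/10 and R's 1:10.
one_based = list(range(1, 11))      # [1, 2, ..., 10]

# A third argument sets the step, like Stata's 0(5)20 or R's seq(0, 20, by = 5).
by_fives = list(range(0, 21, 5))    # [0, 5, 10, 15, 20]

# enumerate() yields a position and a value together; start=1 counts from one.
regions = ["North", "South", "West"]   # illustrative list
for position, region in enumerate(regions, start=1):
    print(position, region)
```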
How to Study This Module?
Learning Roadmap
Day 1 (2 hours): Variables and Data Types
- Read 01 - Variables and Data Types
- Run all code examples
- Try modifying variable values and types
Day 2 (2 hours): Operators
- Read 02 - Operators
- Practice arithmetic, comparison, and logical operations
- Complete practice exercises
Day 3 (3 hours): Conditional Statements
- Read 03 - Conditional Statements
- Write income grouping programs
- Practice nested conditions
Day 4 (3 hours): Loops
- Read 04 - Loops
- Use loops to process data
- Learn list comprehensions
Day 5 (2 hours): Review and Practice
- Complete 05 - Summary and Review
- Comprehensive practice exercises
- Compare Stata/R code
Total Time: 12 hours (1-2 weeks)
Minimalist Learning Path
If time is limited, prioritize:
Must-Learn (core syntax, 6 hours):
- 01 - Variables and Data Types
- 02 - Operators (basics)
- 03 - Conditional Statements (if-elif-else)
- 04 - Loops (for loops + list comprehensions)
Optional (advanced techniques):
- while loops
- Nested loops
- Complex conditional expressions
Learning Recommendations
Hands-on Practice > Reading Comprehension
- Run every code example in Jupyter
- Modify parameters and observe output changes
- Try writing similar code yourself
Comparative Learning
- Compare Python code with Stata/R
- Understand the logic behind syntax differences
- Reproduce previous Stata analyses in Python
Progressive Learning
- Master basic syntax first, then learn advanced techniques
- Don't skip practice exercises
- Check documentation first when encountering difficulties, then seek community help
Build Intuition
- Understanding "why" is more important than memorizing "how"
- Think about the logic behind the code
- Connect programming concepts with research scenarios
Common Questions
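Q: Why does Python's indexing start from 0? A: Zero-based indexing is a convention Python shares with C, Java, and many other languages: the index is the offset from the start of the sequence. It takes some getting used to if you come from Stata or R, where counting starts at 1, but you'll adapt quickly. Remember: my_list[0] is the first element.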
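A quick sketch of how zero-based indexing plays out on a list; the values are illustrative.

```python
incomes = [50000, 62000, 75000, 91000]   # illustrative list

first = incomes[0]       # 50000: index 0 is the first element
last = incomes[-1]       # 91000: negative indexes count from the end
middle = incomes[1:3]    # [62000, 75000]: slices include the start, exclude the stop
count = len(incomes)     # 4, so the valid indexes are 0 through 3

# Indexing past the end raises IndexError rather than returning a missing value.
try:
    incomes[4]
    out_of_range = False
except IndexError:
    out_of_range = True
```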
Q: What's the purpose of indentation? A: Python uses indentation to indicate code blocks, rather than {} or end. This enforces clean code formatting, but be careful not to mix spaces and tabs.
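The sketch below shows how the indentation level alone decides which block a line belongs to. The values are illustrative.

```python
incomes = [25000, 60000, 95000]   # illustrative values

count_high = 0
checked = 0
for income in incomes:
    checked += 1             # indented once: runs for every item
    if income > 50000:
        count_high += 1      # indented twice: runs only when the condition holds
print(count_high, checked)   # not indented: runs once, after the loop ends
```

Moving the print line one level to the right would make it run on every pass through the loop, with no other change to the code.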
Q: Why is = for assignment and == for comparison? A: = assigns the value on the right to the variable on the left, while == checks if both sides are equal. Many languages, including C, Java, and Stata, follow the same convention. R conventionally writes assignment as <- but also uses == for comparison.
Q: List comprehensions seem too difficult, can I skip them? A: Initially you can use regular for loops, but list comprehensions make code more concise. I recommend mastering basic loops first, then coming back to learn comprehensions.
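For readers weighing that choice, here is the same filtering task written both ways, reusing the income list from the loops example. The in_thousands line is an extra illustration, not part of the course example.

```python
incomes = [50000, -100, 75000, -500, 60000]

# Regular for loop
clean_loop = []
for x in incomes:
    if x > 0:
        clean_loop.append(x)

# Equivalent comprehension: [expression for item in sequence if condition]
clean_comp = [x for x in incomes if x > 0]

print(clean_loop == clean_comp)   # True

# The expression part can transform each item as well as filter.
in_thousands = [x / 1000 for x in incomes if x > 0]   # [50.0, 75.0, 60.0]
```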
Q: Can I do data analysis after completing this module? A: Not quite yet. This module only covers syntax basics; actual data analysis requires learning Pandas (Modules 6-7). However, this syntax is a necessary foundation for later learning.
Next Steps
After completing this module, you will have mastered:
- Python's basic syntax and programming logic
- Using variables, operators, conditionals, and loops
- Syntax comparison with Stata/R
In Module 4, we will learn data structures (lists, dictionaries, tuples, sets) to prepare for handling complex data.
In Modules 6-7, we will learn Pandas and begin real data analysis work!
Keep going! Mastering these basic syntax elements puts you well on the way to becoming a Python data analyst!