Function Basics
Making Code Reusable — From Repetitive Tasks to Elegant Programming
What is a Function?
A function is a reusable code block that accepts input (parameters), performs operations, and returns output (results).
Analogy:
- Stata: program define
- R: function()
- Mathematics: f(x) = 2x + 1
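The mathematical analogy carries over to Python almost symbol for symbol. A minimal sketch, where the name `f` is chosen only to mirror the formula (the `def` syntax is explained under "Defining Functions"):

```python
def f(x):
    """Mirror the mathematical function f(x) = 2x + 1."""
    return 2 * x + 1

print(f(3))  # 7
print(f(0))  # 1
```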
Why Do We Need Functions?
- ✅ Avoid code duplication (DRY principle: Don't Repeat Yourself)
- ✅ Make code easier to maintain
- ✅ Improve readability
- ✅ Facilitate testing and debugging
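To make the first point concrete, here is a small sketch of the same calculation written with and without a function. The `bmi` helper and the sample weights and heights are illustrative assumptions, not part of the examples that follow.

```python
# Without a function: the formula is written out again for every person.
bmi_alice = 70 / (1.75 ** 2)
bmi_bob = 82 / (1.80 ** 2)

# With a function: the formula lives in one place and is reused.
def bmi(weight, height):
    """Return body mass index for weight in kg and height in m."""
    return weight / (height ** 2)

# Both approaches give identical results.
print(bmi(70, 1.75) == bmi_alice)  # True
print(bmi(82, 1.80) == bmi_bob)    # True
```

If the formula ever needs to change, the function version requires one edit instead of one edit per person.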
Defining Functions
Basic Syntax
python
def function_name(parameters):
    """Docstring (optional)"""
    # Function body
    return result

Example 1: Simple Function
python
def greet():
    """Print greeting"""
    print("Hello, World!")

# Call the function
greet()  # Output: Hello, World!

Example 2: Function with Parameters
python
def greet_person(name):
    """Print personalized greeting"""
    print(f"Hello, {name}!")

greet_person("Alice")  # Output: Hello, Alice!
greet_person("Bob")    # Output: Hello, Bob!

Example 3: Function with Return Value
python
def calculate_bmi(weight, height):
    """Calculate BMI

    Parameters:
        weight: Weight in kilograms
        height: Height in meters

    Returns:
        BMI value
    """
    bmi = weight / (height ** 2)
    return bmi

# Use the function
result = calculate_bmi(70, 1.75)
print(f"BMI: {result:.2f}")  # BMI: 22.86

Comparison: Stata vs R vs Python
Stata Version
stata
* Stata
program define calc_bmi
    args weight height
    gen bmi = `weight' / (`height'^2)
end

R Version
r
# R
calc_bmi <- function(weight, height) {
  bmi <- weight / (height^2)
  return(bmi)
}

result <- calc_bmi(70, 1.75)

Python Version
python
# Python
def calc_bmi(weight, height):
    bmi = weight / (height ** 2)
    return bmi

result = calc_bmi(70, 1.75)

Real-World Cases
Case 1: Income Tax Calculation
python
def calculate_tax(income):
    """Calculate personal income tax (progressive tax rate)

    Tax rates:
        0-50000: 10%
        50001-100000: 20%
        100001+: 30%
    """
    if income <= 50000:
        tax = income * 0.10
    elif income <= 100000:
        tax = 50000 * 0.10 + (income - 50000) * 0.20
    else:
        tax = 50000 * 0.10 + 50000 * 0.20 + (income - 100000) * 0.30
    return tax

# Usage
incomes = [45000, 75000, 120000]
for income in incomes:
    tax = calculate_tax(income)
    net = income - tax
    print(f"Income: ${income:,}, Tax: ${tax:,.0f}, After-tax: ${net:,.0f}")

Case 2: Data Validation
python
def validate_age(age):
    """Validate if age is reasonable

    Returns:
        (is_valid, message): Boolean value and message
    """
    if age < 0:
        return False, "Age cannot be negative"
    elif age > 120:
        return False, "Age too high"
    elif age < 18:
        return False, "Under 18"
    else:
        return True, "Valid age"

# Usage
ages = [25, -5, 150, 15, 30]
for age in ages:
    is_valid, message = validate_age(age)
    status = "✅" if is_valid else "❌"
    print(f"{status} Age {age}: {message}")

Case 3: Descriptive Statistics
python
def describe_data(data):
    """Calculate descriptive statistics for a non-empty list of numbers

    Returns dictionary containing: n, mean, median, min, max
    """
    n = len(data)
    mean = sum(data) / n
    sorted_data = sorted(data)
    mid = n // 2
    if n % 2 == 1:
        # Odd sample size: the median is the middle value
        median = sorted_data[mid]
    else:
        # Even sample size: the median is the average of the two middle values
        median = (sorted_data[mid - 1] + sorted_data[mid]) / 2
    return {
        'n': n,
        'mean': mean,
        'median': median,
        'min': min(data),
        'max': max(data)
    }

# Usage
scores = [85, 92, 78, 90, 88, 76, 95, 82]
stats = describe_data(scores)
print(f"Sample size: {stats['n']}")
print(f"Mean score: {stats['mean']:.2f}")
print(f"Median: {stats['median']}")
print(f"Lowest score: {stats['min']}")
print(f"Highest score: {stats['max']}")

Return Values
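Every pattern in this section relies on the same rule: `return` hands a value back to the caller and ends the function at that point. A minimal sketch, using a hypothetical `first_negative` helper that is not part of the original examples:

```python
def first_negative(numbers):
    """Return the first negative number, or None if there is none."""
    for number in numbers:
        if number < 0:
            return number  # the function stops here; the loop does not continue
    return None  # reached only when no negative number was found

print(first_negative([3, -1, -7]))  # -1
print(first_negative([3, 1, 7]))    # None
```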
1. Return Single Value
python
def square(x):
    return x ** 2

result = square(5)  # 25

2. Return Multiple Values (Tuple)
python
def calculate_stats(data):
    mean = sum(data) / len(data)
    maximum = max(data)
    minimum = min(data)
    return mean, maximum, minimum  # Automatically packed as tuple

# Unpack on receiving
avg, max_val, min_val = calculate_stats([1, 2, 3, 4, 5])
print(avg, max_val, min_val)  # 3.0 5 1

3. Return Dictionary
python
def get_student_info(name, age, major):
    return {
        'name': name,
        'age': age,
        'major': major,
        'status': 'active'
    }

student = get_student_info("Alice", 25, "Economics")
print(student['name'])  # Alice

4. No Return Value
python
def print_report(data):
    """Only prints, doesn't return"""
    for item in data:
        print(item)
    # No return statement, defaults to returning None

result = print_report([1, 2, 3])
print(result)  # None

Function Parameter Types
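The subsections below describe both sides of a call: parameters are the names listed in the definition, and arguments are the values supplied when the function is called. A minimal sketch, using a hypothetical `rectangle_area` function:

```python
def rectangle_area(width, height):  # width and height are parameters
    return width * height

area = rectangle_area(3, 4)  # 3 and 4 are arguments
print(area)  # 12
```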
1. Positional Parameters
python
def power(base, exponent):
    return base ** exponent

result = power(2, 3)  # 2 ** 3 = 8

2. Default Parameters
python
def greet(name, greeting="Hello"):
    """greeting has a default value"""
    return f"{greeting}, {name}!"

print(greet("Alice"))                  # Hello, Alice!
print(greet("Bob", "Hi"))              # Hi, Bob!
print(greet("Carol", greeting="Hey"))  # Hey, Carol!

3. Keyword Arguments
python
def register_student(name, age, major, gpa=3.0):
    print(f"{name}, {age} years old, Major: {major}, GPA: {gpa}")

# Positional arguments
register_student("Alice", 25, "Economics")

# Keyword arguments (order can vary)
register_student(major="Sociology", age=28, name="Bob", gpa=3.5)

# Mixed (positional arguments must come first)
register_student("Carol", 26, major="Political Science")

4. Variable Arguments (*args)
python
def calculate_average(*numbers):
    """Accept any number of numbers"""
    if not numbers:
        return 0
    return sum(numbers) / len(numbers)

print(calculate_average(1, 2, 3))        # 2.0
print(calculate_average(10, 20, 30, 40)) # 25.0
print(calculate_average(5))              # 5.0

5. Variable Keyword Arguments (**kwargs)
python
def create_respondent(**info):
    """Accept any number of keyword arguments"""
    for key, value in info.items():
        print(f"{key}: {value}")

create_respondent(
    name="Alice",
    age=30,
    income=75000,
    education="Master's"
)

Best Practices
1. Function Naming
python
# ❌ Bad naming
def f(x, y):
    return x + y

# ✅ Good naming (verb-based, describes functionality)
def calculate_total(price, quantity):
    return price * quantity

def validate_email(email):
    return '@' in email

def is_adult(age):
    return age >= 18

2. Single Responsibility Principle
python
# ❌ Function does too many things
def process_data(data):
    # Clean data
    # Calculate statistics
    # Generate charts
    # Save results
    pass

# ✅ Split into multiple functions
def clean_data(data):
    """Only responsible for cleaning"""
    pass

def calculate_stats(data):
    """Only responsible for calculation"""
    pass

def plot_data(data):
    """Only responsible for plotting"""
    pass

3. Docstrings
python
def calculate_correlation(x, y):
    """Calculate correlation coefficient between two variables

    Parameters:
        x (list): Data for first variable
        y (list): Data for second variable

    Returns:
        float: Correlation coefficient (-1 to 1)

    Example:
        >>> calculate_correlation([1, 2, 3], [2, 4, 6])
        1.0
    """
    # Function implementation
    pass

Practice Exercises
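One way to check your solutions to the exercises below is with plain `assert` statements. The sketch uses a hypothetical `double` function as a stand-in, so it does not give away any answers; replace it with your own function and expected values.

```python
def double(x):
    """Hypothetical stand-in for the function you are testing."""
    return 2 * x

# Each assert stays silent when the condition holds
# and raises AssertionError when it does not.
assert double(2) == 4
assert double(0) == 0
assert double(-1.5) == -3.0
print("All checks passed")
```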
Exercise 1: Temperature Conversion
python
# Write two functions:
# 1. celsius_to_fahrenheit(c): Celsius to Fahrenheit
# 2. fahrenheit_to_celsius(f): Fahrenheit to Celsius
# Formula: F = C × 9/5 + 32

Exercise 2: Grade Classification
python
# Write function grade_to_letter(score)
# 90-100: A
# 80-89: B
# 70-79: C
# 60-69: D
# <60: F

Exercise 3: List Statistics
python
# Write function analyze_scores(scores)
# Return dictionary containing:
# - count: Number of scores
# - mean: Average score
# - passing_rate: Pass rate (>=60)
# - grade_distribution: Count per grade level

scores = [85, 92, 78, 65, 90, 55, 88, 76, 95, 70]
result = analyze_scores(scores)

Next Steps
In the next section, we'll learn about advanced function parameter usage, including argument unpacking, lambda functions, and more.
Keep going!