
Function Basics

Making Code Reusable — From Repetitive Tasks to Elegant Programming


What is a Function?

A function is a reusable code block that accepts input (parameters), performs operations, and returns output (results).

Counterparts in other tools:

  • Stata: program define
  • R: function()
  • Mathematics: f(x) = 2x + 1

Why Do We Need Functions?

  • ✅ Avoid code duplication (DRY principle: Don't Repeat Yourself)
  • ✅ Make code easier to maintain
  • ✅ Improve readability
  • ✅ Facilitate testing and debugging
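
To make the DRY point concrete, here is a minimal sketch contrasting a repeated formula with a function. The name `percent_change` and the income figures are made up for illustration.

```python
# Without a function: the same formula is retyped for every pair of values
growth_first = (52000 - 50000) / 50000 * 100
growth_second = (54600 - 52000) / 52000 * 100

# With a function: the formula lives in one place
def percent_change(old, new):
    """Return the percentage change from old to new (hypothetical helper)."""
    return (new - old) / old * 100

incomes = [50000, 52000, 54600]  # illustrative values only
# Pair each value with its successor and apply the function to every pair
growth = [percent_change(a, b) for a, b in zip(incomes, incomes[1:])]
print(growth)
```

If the formula ever needs to change (for example, to guard against `old == 0`), only the function body has to be edited.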

Defining Functions

Basic Syntax

python
def function_name(parameters):
    """Docstring (optional)"""
    # Function body
    return result

Example 1: Simple Function

python
def greet():
    """Print greeting"""
    print("Hello, World!")

# Call the function
greet()  # Output: Hello, World!

Example 2: Function with Parameters

python
def greet_person(name):
    """Print personalized greeting"""
    print(f"Hello, {name}!")

greet_person("Alice")  # Output: Hello, Alice!
greet_person("Bob")    # Output: Hello, Bob!

Example 3: Function with Return Value

python
def calculate_bmi(weight, height):
    """Calculate BMI

    Parameters:
        weight: Weight in kilograms
        height: Height in meters

    Returns:
        BMI value
    """
    bmi = weight / (height ** 2)
    return bmi

# Use the function
result = calculate_bmi(70, 1.75)
print(f"BMI: {result:.2f}")  # BMI: 22.86

Comparison: Stata vs R vs Python

Stata Version

stata
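* Stata
* Unlike the R and Python versions, this creates a new variable (bmi)
* in the dataset rather than returning a value.
* Call as: calc_bmi weight height  (both arguments are variable names)
program define calc_bmi
    args weight height
    gen bmi = `weight' / (`height'^2)
end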

R Version

r
# R
calc_bmi <- function(weight, height) {
  bmi <- weight / (height^2)
  return(bmi)
}

result <- calc_bmi(70, 1.75)

Python Version

python
# Python
def calc_bmi(weight, height):
    bmi = weight / (height ** 2)
    return bmi

result = calc_bmi(70, 1.75)

Real-World Cases

Case 1: Income Tax Calculation

python
def calculate_tax(income):
    """Calculate personal income tax (progressive tax rate)

    Tax rates:
        0-50000: 10%
        50001-100000: 20%
        100001+: 30%
    """
    if income <= 50000:
        tax = income * 0.10
    elif income <= 100000:
        tax = 50000 * 0.10 + (income - 50000) * 0.20
    else:
        tax = 50000 * 0.10 + 50000 * 0.20 + (income - 100000) * 0.30

    return tax

# Usage
incomes = [45000, 75000, 120000]
for income in incomes:
    tax = calculate_tax(income)
    net = income - tax
    print(f"Income: ${income:,}, Tax: ${tax:,.0f}, After-tax: ${net:,.0f}")

Case 2: Data Validation

python
def validate_age(age):
    """Check that age is plausible and that the respondent is an adult (18+)

    Returns:
        (is_valid, message): Boolean value and message
    """
    if age < 0:
        return False, "Age cannot be negative"
    elif age > 120:
        return False, "Age too high"
    elif age < 18:
        return False, "Under 18"
    else:
        return True, "Valid age"

# Usage
ages = [25, -5, 150, 15, 30]
for age in ages:
    is_valid, message = validate_age(age)
    status = "✅" if is_valid else "❌"
    print(f"{status} Age {age}: {message}")

Case 3: Descriptive Statistics

python
def describe_data(data):
    """Calculate descriptive statistics

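    Returns a dictionary containing: n, mean, median, min, max.
    Assumes data is a non-empty list of numbers.
    """
    n = len(data)
    mean = sum(data) / n
    sorted_data = sorted(data)
    mid = n // 2
    if n % 2 == 1:
        median = sorted_data[mid]  # odd n: the single middle value
    else:
        # even n: average the two middle values
        median = (sorted_data[mid - 1] + sorted_data[mid]) / 2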

    return {
        'n': n,
        'mean': mean,
        'median': median,
        'min': min(data),
        'max': max(data)
    }

# Usage
scores = [85, 92, 78, 90, 88, 76, 95, 82]
stats = describe_data(scores)

print(f"Sample size: {stats['n']}")
print(f"Mean score: {stats['mean']:.2f}")
print(f"Median: {stats['median']}")
print(f"Lowest score: {stats['min']}")
print(f"Highest score: {stats['max']}")
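
A hand-written median is easy to get wrong when the sample size is even, so it can be worth cross-checking against Python's standard `statistics` module. The sketch below reuses the scores from the example above; the `summary` dictionary is just an illustrative name.

```python
import statistics

scores = [85, 92, 78, 90, 88, 76, 95, 82]

summary = {
    "n": len(scores),
    "mean": statistics.mean(scores),
    # statistics.median averages the two middle values when n is even
    "median": statistics.median(scores),
    "min": min(scores),
    "max": max(scores),
}
print(summary)
```

With eight scores, the two middle values of the sorted list are 85 and 88, so the median is 86.5.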

Return Values

1. Return Single Value

python
def square(x):
    return x ** 2

result = square(5)  # 25

2. Return Multiple Values (Tuple)

python
def calculate_stats(data):
    mean = sum(data) / len(data)
    maximum = max(data)
    minimum = min(data)
    return mean, maximum, minimum  # Automatically packed as tuple

# Unpack on receiving
avg, max_val, min_val = calculate_stats([1, 2, 3, 4, 5])
print(avg, max_val, min_val)  # 3.0 5 1

3. Return Dictionary

python
def get_student_info(name, age, major):
    return {
        'name': name,
        'age': age,
        'major': major,
        'status': 'active'
    }

student = get_student_info("Alice", 25, "Economics")
print(student['name'])  # Alice

4. No Return Value

python
def print_report(data):
    """Only prints, doesn't return"""
    for item in data:
        print(item)
    # No return statement, defaults to returning None

result = print_report([1, 2, 3])
print(result)  # None

Function Parameter Types

1. Positional Parameters

python
def power(base, exponent):
    return base ** exponent

result = power(2, 3)  # 2^3 = 8

2. Default Parameters

python
def greet(name, greeting="Hello"):
    """greeting has a default value"""
    return f"{greeting}, {name}!"

print(greet("Alice"))              # Hello, Alice!
print(greet("Bob", "Hi"))          # Hi, Bob!
print(greet("Carol", greeting="Hey"))  # Hey, Carol!

3. Keyword Arguments

python
def register_student(name, age, major, gpa=3.0):
    print(f"{name}, {age} years old, Major: {major}, GPA: {gpa}")

# Positional arguments
register_student("Alice", 25, "Economics")

# Keyword arguments (order can vary)
register_student(major="Sociology", age=28, name="Bob", gpa=3.5)

# Mixed (positional arguments must come first)
register_student("Carol", 26, major="Political Science")

4. Variable Arguments (*args)

python
def calculate_average(*numbers):
    """Accept any number of numbers"""
    if not numbers:
        return 0
    return sum(numbers) / len(numbers)

print(calculate_average(1, 2, 3))           # 2.0
print(calculate_average(10, 20, 30, 40))    # 25.0
print(calculate_average(5))                 # 5.0

5. Variable Keyword Arguments (**kwargs)

python
def create_respondent(**info):
    """Accept any number of keyword arguments"""
    for key, value in info.items():
        print(f"{key}: {value}")

create_respondent(
    name="Alice",
    age=30,
    income=75000,
    education="Master's"
)
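
The parameter kinds above can be combined in one signature, as long as regular parameters come first, then `*args`, then `**kwargs`. The following is a minimal sketch; `summarize_respondent` and its fields are hypothetical and not part of any library.

```python
def summarize_respondent(respondent_id, *scores, **info):
    """Combine a required parameter, *args, and **kwargs (hypothetical example)."""
    record = {"id": respondent_id}
    # scores is a tuple holding the extra positional arguments
    record["mean_score"] = sum(scores) / len(scores) if scores else None
    # info is a dict holding the extra keyword arguments
    record.update(info)
    return record

record = summarize_respondent(101, 85, 90, 78, name="Alice", education="Master's")
print(record)
```

Returning the dictionary instead of printing it lets the caller decide what to do with the result.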

Best Practices

1. Function Naming

python
# ❌ Bad naming
def f(x, y):
    return x + y

# ✅ Good naming (verb-based, describes functionality)
def calculate_total(price, quantity):
    return price * quantity

def validate_email(email):
    return '@' in email

def is_adult(age):
    return age >= 18

2. Single Responsibility Principle

python
# ❌ Function does too many things
def process_data(data):
    # Clean data
    # Calculate statistics
    # Generate charts
    # Save results
    pass

# ✅ Split into multiple functions
def clean_data(data):
    """Only responsible for cleaning"""
    pass

def calculate_stats(data):
    """Only responsible for calculation"""
    pass

def plot_data(data):
    """Only responsible for plotting"""
    pass
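
One way the split functions might fit together is sketched below. The bodies are assumptions made for illustration (cleaning here just means dropping `None` values), and a hypothetical `format_report` stands in for plotting so the example needs no extra libraries.

```python
def clean_data(data):
    """Drop missing values (None); an assumed cleaning rule for this sketch."""
    return [x for x in data if x is not None]

def calculate_stats(data):
    """Return the count and mean of already-clean, non-empty data."""
    return {"n": len(data), "mean": sum(data) / len(data)}

def format_report(stats):
    """Turn a stats dictionary into a one-line summary string."""
    return f"n={stats['n']}, mean={stats['mean']:.2f}"

raw_scores = [85, None, 92, 78, None, 90]  # illustrative data
# Each step does one job, and the output of one feeds the next
report = format_report(calculate_stats(clean_data(raw_scores)))
print(report)
```

Because each function has a single job, each one can be tested or replaced without touching the others.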

3. Docstrings

python
def calculate_correlation(x, y):
    """Calculate correlation coefficient between two variables

    Parameters:
        x (list): Data for first variable
        y (list): Data for second variable

    Returns:
        float: Correlation coefficient (-1 to 1)

    Example:
        >>> calculate_correlation([1, 2, 3], [2, 4, 6])
        1.0
    """
    # Function implementation
    pass
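
The body above is left as a placeholder. One possible implementation, assuming the Pearson correlation coefficient is intended, is sketched here; the error handling is an added assumption rather than part of the original example.

```python
import math

def calculate_correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)

r = calculate_correlation([1, 2, 3], [2, 4, 6])
print(round(r, 6))
# The docstring is available at runtime through __doc__ or help()
print(calculate_correlation.__doc__)
```

A docstring is stored on the function object, which is why `help(calculate_correlation)` can display it in an interactive session.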

Practice Exercises

Exercise 1: Temperature Conversion

python
# Write two functions:
# 1. celsius_to_fahrenheit(c): Celsius to Fahrenheit
# 2. fahrenheit_to_celsius(f): Fahrenheit to Celsius
# Formula: F = C × 9/5 + 32

Exercise 2: Grade Classification

python
# Write function grade_to_letter(score)
# 90-100: A
# 80-89: B
# 70-79: C
# 60-69: D
# <60: F

Exercise 3: List Statistics

python
# Write function analyze_scores(scores)
# Return dictionary containing:
# - count: Number of scores
# - mean: Average score
# - passing_rate: Share of scores that are >= 60
# - grade_distribution: Count per grade level

scores = [85, 92, 78, 65, 90, 55, 88, 76, 95, 70]
result = analyze_scores(scores)

Next Steps

In the next section, we'll learn about advanced function parameter usage, including argument unpacking, lambda functions, and more.

Keep going!

Released under the MIT License. Content © Author.