Function Basics

Making Code Reusable — From Repetitive Tasks to Elegant Programming

What is a Function?

A function is a reusable code block that accepts input (parameters), performs operations, and returns output (results).

Analogous constructs:

Stata: program define
R: function()
Mathematics: f(x) = 2x + 1
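
The mathematical analogy carries over almost word for word. A minimal sketch, where the name `f` is chosen only to mirror the formula above:

```python
def f(x):
    """Return 2x + 1, mirroring the formula f(x) = 2x + 1."""
    return 2 * x + 1

print(f(3))  # 7
print(f(0))  # 1
```

The input `x` plays the role of the parameter, and the value after `return` is the output.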

Why Do We Need Functions?

✅ Avoid code duplication (DRY principle: Don't Repeat Yourself)
✅ Make code easier to maintain
✅ Improve readability
✅ Facilitate testing and debugging
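
As a small illustration of the DRY principle, the sketch below first writes the same formula out twice and then wraps it in a function. The name `to_percent` and the sample numbers are hypothetical, chosen only for illustration.

```python
# Without a function: the same formula is written out for every case
share_a = round(12 / 48 * 100, 1)
share_b = round(30 / 48 * 100, 1)

# With a function: the formula lives in one place
def to_percent(part, total):
    """Return part as a percentage of total, rounded to one decimal."""
    return round(part / total * 100, 1)

print(to_percent(12, 48))  # 25.0
print(to_percent(30, 48))  # 62.5
```

If the rounding rule ever changes, only the function body needs to be edited.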

Defining Functions

Basic Syntax

python

def function_name(parameters):
    """Docstring (optional)"""
    # Function body
    return result

Example 1: Simple Function

python

def greet():
    """Print greeting"""
    print("Hello, World!")

# Call the function
greet()  # Output: Hello, World!

Example 2: Function with Parameters

python

def greet_person(name):
    """Print personalized greeting"""
    print(f"Hello, {name}!")

greet_person("Alice")  # Output: Hello, Alice!
greet_person("Bob")    # Output: Hello, Bob!

Example 3: Function with Return Value

python

def calculate_bmi(weight, height):
    """Calculate BMI

    Parameters:
        weight: Weight in kilograms
        height: Height in meters

    Returns:
        BMI value
    """
    bmi = weight / (height ** 2)
    return bmi

# Use the function
result = calculate_bmi(70, 1.75)
print(f"BMI: {result:.2f}")  # BMI: 22.86

Comparison: Stata vs R vs Python

Stata Version

stata

* Stata
program define calc_bmi
    args weight height
    gen bmi = `weight' / (`height'^2)
end

R Version

r

# R
calc_bmi <- function(weight, height) {
  bmi <- weight / (height^2)
  return(bmi)
}

result <- calc_bmi(70, 1.75)

Python Version

python

# Python
def calc_bmi(weight, height):
    bmi = weight / (height ** 2)
    return bmi

result = calc_bmi(70, 1.75)

Real-World Cases

Case 1: Income Tax Calculation

python

def calculate_tax(income):
    """Calculate personal income tax (progressive tax rate)

    Tax rates:
        0-50000: 10%
        50001-100000: 20%
        100001+: 30%
    """
    if income <= 50000:
        tax = income * 0.10
    elif income <= 100000:
        tax = 50000 * 0.10 + (income - 50000) * 0.20
    else:
        tax = 50000 * 0.10 + 50000 * 0.20 + (income - 100000) * 0.30

    return tax

# Usage
incomes = [45000, 75000, 120000]
for income in incomes:
    tax = calculate_tax(income)
    net = income - tax
    print(f"Income: ${income:,}, Tax: ${tax:,.0f}, After-tax: ${net:,.0f}")
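
One possible variation keeps the bracket thresholds in a single table instead of repeating them in each branch. This is only a sketch under the same three rates; the names `BRACKETS` and `calculate_tax_table` are hypothetical and not part of the original example.

```python
# Hypothetical bracket table: (upper limit of bracket, marginal rate)
BRACKETS = [(50000, 0.10), (100000, 0.20), (float("inf"), 0.30)]

def calculate_tax_table(income):
    """Apply each marginal rate only to the slice of income inside its bracket."""
    tax = 0.0
    lower = 0
    for upper, rate in BRACKETS:
        if income > lower:
            # Tax the part of the income that falls between lower and upper
            tax += (min(income, upper) - lower) * rate
        lower = upper
    return tax

for income in [45000, 75000, 120000]:
    print(income, round(calculate_tax_table(income)))
# 45000 4500
# 75000 10000
# 120000 21000
```

Adding a fourth bracket would then mean adding one row to the table rather than another `elif` branch.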

Case 2: Data Validation

python

def validate_age(age):
    """Validate that age is plausible and at least 18

    Returns:
        (is_valid, message): Boolean value and message
    """
    if age < 0:
        return False, "Age cannot be negative"
    elif age > 120:
        return False, "Age too high"
    elif age < 18:
        return False, "Under 18"
    else:
        return True, "Valid age"

# Usage
ages = [25, -5, 150, 15, 30]
for age in ages:
    is_valid, message = validate_age(age)
    status = "✅" if is_valid else "❌"
    print(f"{status} Age {age}: {message}")

Case 3: Descriptive Statistics

python

def describe_data(data):
    """Calculate descriptive statistics

    Returns dictionary containing: n, mean, median, min, max
    """
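    n = len(data)
    if n == 0:
        raise ValueError("data must not be empty")
    mean = sum(data) / n
    sorted_data = sorted(data)
    mid = n // 2
    if n % 2 == 1:
        median = sorted_data[mid]
    else:
        # Even count: average the two middle values
        median = (sorted_data[mid - 1] + sorted_data[mid]) / 2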

    return {
        'n': n,
        'mean': mean,
        'median': median,
        'min': min(data),
        'max': max(data)
    }

# Usage
scores = [85, 92, 78, 90, 88, 76, 95, 82]
stats = describe_data(scores)

print(f"Sample size: {stats['n']}")
print(f"Mean score: {stats['mean']:.2f}")
print(f"Median: {stats['median']}")
print(f"Lowest score: {stats['min']}")
print(f"Highest score: {stats['max']}")
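
Python's standard library also ships a `statistics` module whose `mean` and `median` functions can serve as a cross-check for hand-written code. A brief sketch using the same scores; note that for an even number of values the median is the average of the two middle values.

```python
import statistics

scores = [85, 92, 78, 90, 88, 76, 95, 82]

print(statistics.mean(scores))    # 85.75
print(statistics.median(scores))  # 86.5
```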

Return Values

1. Return Single Value

python

def square(x):
    return x ** 2

result = square(5)  # 25

2. Return Multiple Values (Tuple)

python

def calculate_stats(data):
    mean = sum(data) / len(data)
    maximum = max(data)
    minimum = min(data)
    return mean, maximum, minimum  # Automatically packed as tuple

# Unpack on receiving
avg, max_val, min_val = calculate_stats([1, 2, 3, 4, 5])
print(avg, max_val, min_val)  # 3.0 5 1

3. Return Dictionary

python

def get_student_info(name, age, major):
    return {
        'name': name,
        'age': age,
        'major': major,
        'status': 'active'
    }

student = get_student_info("Alice", 25, "Economics")
print(student['name'])  # Alice

4. No Return Value

python

def print_report(data):
    """Only prints, doesn't return"""
    for item in data:
        print(item)
    # No return statement, defaults to returning None

result = print_report([1, 2, 3])
print(result)  # None

Function Parameter Types

1. Positional Parameters

python

def power(base, exponent):
    return base ** exponent

result = power(2, 3)  # 2 ** 3 = 8

2. Default Parameters

python

def greet(name, greeting="Hello"):
    """greeting has a default value"""
    return f"{greeting}, {name}!"

print(greet("Alice"))              # Hello, Alice!
print(greet("Bob", "Hi"))          # Hi, Bob!
print(greet("Carol", greeting="Hey"))  # Hey, Carol!

3. Keyword Arguments

python

def register_student(name, age, major, gpa=3.0):
    print(f"{name}, {age} years old, Major: {major}, GPA: {gpa}")

# Positional arguments
register_student("Alice", 25, "Economics")

# Keyword arguments (order can vary)
register_student(major="Sociology", age=28, name="Bob", gpa=3.5)

# Mixed (positional arguments must come first)
register_student("Carol", 26, major="Political Science")

4. Variable Arguments (*args)

python

def calculate_average(*numbers):
    """Accept any number of numbers"""
    if not numbers:
        return 0
    return sum(numbers) / len(numbers)

print(calculate_average(1, 2, 3))           # 2.0
print(calculate_average(10, 20, 30, 40))    # 25.0
print(calculate_average(5))                 # 5.0

5. Variable Keyword Arguments (**kwargs)

python

def create_respondent(**info):
    """Accept any number of keyword arguments"""
    for key, value in info.items():
        print(f"{key}: {value}")

create_respondent(
    name="Alice",
    age=30,
    income=75000,
    education="Master's"
)

Best Practices

1. Function Naming

python

# ❌ Bad naming
def f(x, y):
    return x + y

# ✅ Good naming (verb-based, describes functionality)
def calculate_total(price, quantity):
    return price * quantity

def validate_email(email):
    return '@' in email

def is_adult(age):
    return age >= 18

2. Single Responsibility Principle

python

# ❌ Function does too many things
def process_data(data):
    # Clean data
    # Calculate statistics
    # Generate charts
    # Save results
    pass

# ✅ Split into multiple functions
def clean_data(data):
    """Only responsible for cleaning"""
    pass

def calculate_stats(data):
    """Only responsible for calculation"""
    pass

def plot_data(data):
    """Only responsible for plotting"""
    pass
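
To show how small single-purpose functions fit together, here is a sketch with concrete bodies. The helper names `calculate_mean` and `format_report`, the use of `None` for missing values, and the sample data are assumptions made only for this illustration.

```python
def clean_data(data):
    """Drop missing values (None) from the list."""
    return [x for x in data if x is not None]

def calculate_mean(data):
    """Return the mean of a non-empty list of numbers."""
    return sum(data) / len(data)

def format_report(mean):
    """Turn the result into a printable line."""
    return f"Mean: {mean:.2f}"

raw = [85, None, 92, 78, None, 90]
cleaned = clean_data(raw)
print(format_report(calculate_mean(cleaned)))  # Mean: 86.25
```

Because each step is its own function, each can be tested or replaced without touching the others.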

3. Docstrings

python

def calculate_correlation(x, y):
    """Calculate correlation coefficient between two variables

    Parameters:
        x (list): Data for first variable
        y (list): Data for second variable

    Returns:
        float: Correlation coefficient (-1 to 1)

    Example:
        >>> calculate_correlation([1, 2, 3], [2, 4, 6])
        1.0
    """
    # Function implementation
    pass
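
The placeholder body above could be filled in several ways. The sketch below assumes the Pearson correlation coefficient is intended, which the original does not specify, and adds input checks that are likewise an assumption.

```python
import math

def calculate_correlation(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations, and sums of squared deviations
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)

print(calculate_correlation([1, 2, 3], [2, 4, 6]))  # 1.0
```

The docstring is also available at runtime through `help(calculate_correlation)` or `calculate_correlation.__doc__`.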

Practice Exercises

Exercise 1: Temperature Conversion

python

# Write two functions:
# 1. celsius_to_fahrenheit(c): Celsius to Fahrenheit
# 2. fahrenheit_to_celsius(f): Fahrenheit to Celsius
# Formula: F = C × 9/5 + 32

Exercise 2: Grade Classification

python

# Write function grade_to_letter(score)
# 90-100: A
# 80-89: B
# 70-79: C
# 60-69: D
# <60: F

Exercise 3: List Statistics

python

# Write function analyze_scores(scores)
# Return dictionary containing:
# - count: Number of scores
# - mean: Average score
# - passing_rate: Pass rate (>=60)
# - grade_distribution: Count per grade level

scores = [85, 92, 78, 65, 90, 55, 88, 76, 95, 70]
result = analyze_scores(scores)

Next Steps

In the next section, we'll learn about advanced function parameter usage, including argument unpacking, lambda functions, and more.

Keep going!
