2. Programming

Python Basics

Core Python syntax, control flow, functions, and idiomatic constructs required to manipulate data and implement analytical logic effectively.

Hey students! 👋 Welcome to the exciting world of Python programming! This lesson will introduce you to the fundamental building blocks of Python - one of the most popular programming languages in data science. By the end of this lesson, you'll understand core Python syntax, control structures, and functions that form the foundation for data manipulation and analysis. Think of this as learning the alphabet before writing stories - these basics will empower you to tackle complex data science challenges with confidence! 🚀

Understanding Python Syntax and Data Types

Python is known for its clean, readable syntax that resembles natural English. Unlike many programming languages that use curly braces {}, Python uses indentation to define code blocks, making it incredibly beginner-friendly.

Let's start with variables - containers that store data values. In Python, you don't need to declare variable types explicitly:

name = "Alice"           # String
age = 17                 # Integer
height = 5.6             # Float
is_student = True        # Boolean
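To see what "no explicit type declarations" means in practice, here is a small sketch. The variable names are made up for illustration; `type()` is a Python built-in that reports the type of the value a name currently refers to.

```python
# A name is not locked to one type: it can be rebound to a different kind of value.
value = 17
first_type = type(value).__name__    # the name now refers to an integer

value = "seventeen"
second_type = type(value).__name__   # the same name now refers to a string

print(first_type, second_type)       # prints: int str
```

Rebinding a name to a different type is legal, but in analysis code it is usually clearer to keep one name for one kind of value.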

Python has several built-in data types that are essential for data science work:

Strings are sequences of characters used for text data. They're incredibly versatile - you can slice them, concatenate them, and format them. For example, if you're analyzing customer feedback, strings help you process text responses.

Numbers come in two main flavors: integers (whole numbers) and floats (decimal numbers). In data science, you'll use these constantly for calculations, statistics, and measurements. The average human body temperature is 98.6°F - that's a float!

Booleans represent True/False values and are crucial for logical operations. When filtering datasets (like finding students with grades above 90%), Boolean logic determines which records to include.

Lists are ordered collections that can store multiple items: grades = [85, 92, 78, 96, 88]. They're perfect for storing sequences of data points, like daily temperatures or stock prices.

Dictionaries store key-value pairs: student = {"name": "John", "grade": 95, "subject": "Math"}. Think of them as digital filing cabinets - perfect for organizing related information about entities in your datasets.
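The data types described above can be tried out in a few lines. This is only an illustrative sketch: the `comment` text and the temperature readings are invented sample values, while `grades` and `student` reuse the examples from the text.

```python
# Strings: slicing, concatenation, and f-string formatting
comment = "Great service"
first_word = comment[:5]                     # characters at positions 0 through 4
shout = comment + "!"                        # concatenation builds a new string
summary = f"{len(comment)} characters long"  # formatting with an f-string

# Numbers: dividing with / always produces a float, even for two integers
average_temp = (98 + 99) / 2

# Booleans: a comparison evaluates to True or False
grades = [85, 92, 78, 96, 88]
has_top_grade = max(grades) > 90

# Lists are indexed by position, dictionaries by key
first_grade = grades[0]
student = {"name": "John", "grade": 95, "subject": "Math"}
student_grade = student["grade"]

print(first_word, shout, summary, average_temp, has_top_grade)
```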

Control Flow: Making Decisions and Repeating Actions

Control flow statements are like traffic lights for your code - they determine which path your program takes based on conditions.

Conditional statements use if, elif, and else to make decisions:

temperature = 75
if temperature > 80:
    print("It's hot! 🌞")
elif temperature > 60:
    print("Perfect weather! 😊")
else:
    print("It's cold! 🧄")

This is incredibly useful in data analysis. Imagine you're analyzing student performance - you could automatically categorize grades as "Excellent" (90+), "Good" (80-89), "Average" (70-79), or "Needs Improvement" (below 70).
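The grade categories just described could be written as one `if`/`elif`/`else` chain. This is a minimal sketch; the `score` value is a made-up example.

```python
score = 84  # hypothetical score to categorize

# Conditions are checked from top to bottom; the first one that is true wins.
if score >= 90:
    category = "Excellent"
elif score >= 80:
    category = "Good"
elif score >= 70:
    category = "Average"
else:
    category = "Needs Improvement"

print(f"{score} -> {category}")  # prints: 84 -> Good
```

Because the checks run in order, each `elif` only needs the lower bound of its range: a score that reaches `score >= 80` has already failed `score >= 90`.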

Loops help you repeat actions efficiently. The for loop iterates through sequences:

scores = [85, 92, 78, 96, 88]
total = 0
for score in scores:
    total += score
average = total / len(scores)

The while loop continues until a condition becomes false:

count = 1
while count <= 5:
    print(f"Iteration {count}")
    count += 1

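In data science, loops are essential for processing large datasets, because the same few lines of code can run once for every record. Streaming services such as Netflix work with enormous volumes of viewing records and rely on this kind of repeated processing to generate personalized recommendations! 📺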
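Loops get more useful when combined with conditions. The sketch below reuses the `scores` list from earlier; the threshold of 200 and the variable names are arbitrary choices for illustration.

```python
scores = [85, 92, 78, 96, 88]

# for + if: count how many scores are above 90
high_count = 0
for score in scores:
    if score > 90:
        high_count += 1

# enumerate() hands back each position together with its value
best_position = 0
for position, score in enumerate(scores):
    if score > scores[best_position]:
        best_position = position

# while: keep adding scores until the running total passes 200
running_total = 0
index = 0
while index < len(scores) and running_total <= 200:
    running_total += scores[index]
    index += 1

print(high_count, best_position, running_total, index)  # prints: 2 3 255 3
```

The `index < len(scores)` check in the `while` condition guards against running past the end of the list if the threshold is never reached.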

Functions: Building Reusable Code Blocks

Functions are like recipes - they take ingredients (parameters), follow steps (code), and produce a result (return value). They're fundamental to writing clean, maintainable code.

Here's a simple function that calculates the area of a circle:

import math

def calculate_circle_area(radius):
    area = math.pi * radius ** 2  # math.pi is more precise than typing 3.14159 by hand
    return area

# Using the function
circle_area = calculate_circle_area(5)
print(f"Area: {circle_area}")

Functions become incredibly powerful in data science. You might create a function to clean messy data, calculate statistical measures, or generate visualizations. Large systems such as Google's search engine are built from many functions working together to rank web pages and deliver relevant results quickly! 🔍
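As one possible example of a small data-cleaning helper, the sketch below averages a list while skipping missing entries. The function name `mean_ignoring_missing` and the choice of `None` as the marker for a missing value are assumptions made for this illustration.

```python
def mean_ignoring_missing(values):
    """Return the mean of the values that are not None, or None if all are missing."""
    total = 0
    count = 0
    for value in values:
        if value is not None:   # skip missing entries
            total += value
            count += 1
    if count == 0:
        return None             # nothing to average
    return total / count

# Usage: one quiz score is missing
print(mean_ignoring_missing([80, None, 90]))  # prints: 85.0
```

Wrapping the logic in a function means the same cleaning rule can be applied to every column or dataset without copying the loop.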

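Parameters and arguments make functions flexible: parameters are the names listed in the function definition, and arguments are the values you pass when calling it. A parameter can have a default value, which is used when the caller leaves that argument out: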

def greet_student(name, grade_level="high school"):
    return f"Hello {name}, welcome to {grade_level}!"

print(greet_student("students"))  # Uses default
print(greet_student("students", "college"))  # Overrides default

Return statements send results back to the calling code. Functions can return multiple values using tuples:

def analyze_scores(scores):
    minimum = min(scores)
    maximum = max(scores)
    average = sum(scores) / len(scores)
    return minimum, maximum, average

min_score, max_score, avg_score = analyze_scores([85, 92, 78, 96, 88])
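To make the "multiple values" idea concrete, the sketch below repeats `analyze_scores` so it runs on its own and shows that what comes back is a single tuple, which can be indexed or unpacked. The names `result`, `lowest`, and `highest` are illustrative.

```python
def analyze_scores(scores):
    minimum = min(scores)
    maximum = max(scores)
    average = sum(scores) / len(scores)
    return minimum, maximum, average   # packed into one tuple

result = analyze_scores([85, 92, 78, 96, 88])
print(type(result).__name__)   # prints: tuple
print(result)                  # prints: (78, 96, 87.8)

# Pick out one item by position, or unpack and ignore what you don't need
lowest = result[0]
_, highest, _ = result
```

Using `_` for values you don't need is a common convention, not a special syntax.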

Working with Collections and Data Structures

Python's built-in collections are the workhorses of data manipulation. Understanding them deeply will make you incredibly efficient at processing information.

Lists are mutable (changeable) and ordered. You can add, remove, and modify elements:

daily_temperatures = [72, 75, 68, 80, 77]
daily_temperatures.append(82)  # Add new temperature
daily_temperatures[0] = 73     # Update first day
hottest_day = max(daily_temperatures)
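The example above adds and updates items. Removing items, slicing, and sorting work along the same lines, as this short sketch with the original temperature list shows.

```python
daily_temperatures = [72, 75, 68, 80, 77]

daily_temperatures.remove(68)         # remove the first item equal to 68
last_reading = daily_temperatures.pop()   # remove and return the last item (77)

first_two = daily_temperatures[:2]    # slicing returns a new list: [72, 75]
hottest_first = sorted(daily_temperatures, reverse=True)  # new list: [80, 75, 72]

print(daily_temperatures)             # prints: [72, 75, 80]
```

`sorted()` returns a new list and leaves the original untouched, whereas `remove()` and `pop()` change the list in place.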

List comprehensions provide a concise way to create lists:

# Convert Celsius to Fahrenheit
celsius_temps = [20, 25, 30, 35]
fahrenheit_temps = [c * 9/5 + 32 for c in celsius_temps]
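A comprehension can also filter while it builds the list by adding an `if` at the end. Here is a brief sketch using the grades list from earlier; the variable names are illustrative.

```python
grades = [85, 92, 78, 96, 88]

# Keep only the grades above 90
top_grades = [g for g in grades if g > 90]         # [92, 96]

# Turn each grade into a True/False flag instead of filtering
passed_flags = [g >= 80 for g in grades]           # [True, True, False, True, True]

# Equivalent loop for the first comprehension, for comparison
top_grades_loop = []
for g in grades:
    if g > 90:
        top_grades_loop.append(g)
```

Both forms produce the same list; the comprehension is simply shorter once the pattern is familiar.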

Dictionaries excel at organizing related data. Social media platforms use dictionary-like structures to store user profiles, with keys like "username," "followers," and "posts."

student_record = {
    "name": "Sarah",
    "subjects": ["Math", "Science", "English"],
    "grades": {"Math": 95, "Science": 88, "English": 92}
}
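Once a record like this exists, you will usually want to read, extend, and loop over it. The sketch below repeats the `student_record` dictionary so it runs on its own; the "History" subject is an invented addition.

```python
student_record = {
    "name": "Sarah",
    "subjects": ["Math", "Science", "English"],
    "grades": {"Math": 95, "Science": 88, "English": 92},
}

# Nested lookup: first the "grades" dictionary, then the subject
math_grade = student_record["grades"]["Math"]             # 95

# .get() returns None instead of raising an error when a key is missing
history_grade = student_record["grades"].get("History")   # None

# Add a new key-value pair
student_record["grades"]["History"] = 90

# .items() yields each key together with its value
strong_subjects = []
for subject, grade in student_record["grades"].items():
    if grade > 90:
        strong_subjects.append(subject)

print(strong_subjects)  # prints: ['Math', 'English']
```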

String manipulation is crucial for text data processing. You'll often need to clean, format, and analyze textual information:

feedback = "  This product is AMAZING!!!  "
cleaned = feedback.strip().lower().replace("!", "")
word_count = len(cleaned.split())
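Chained method calls are compact but can hide what each step does. The sketch below splits the same cleanup into separate steps and adds a membership test and a join; the intermediate variable names are illustrative.

```python
feedback = "  This product is AMAZING!!!  "

stripped = feedback.strip()          # "This product is AMAZING!!!"
lowered = stripped.lower()           # "this product is amazing!!!"
cleaned = lowered.replace("!", "")   # "this product is amazing"

words = cleaned.split()              # ["this", "product", "is", "amazing"]
word_count = len(words)              # 4

mentions_amazing = "amazing" in words    # membership test on the word list
slug = "-".join(words)                   # "this-product-is-amazing"
```

Strings are immutable, so each method returns a new string and `feedback` itself never changes.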

Conclusion

Congratulations, students! 🎉 You've just learned the essential building blocks of Python programming. We covered Python's clean syntax and fundamental data types (strings, numbers, booleans, lists, and dictionaries), explored control flow with conditionals and loops, mastered functions for creating reusable code, and discovered how to work with collections effectively. These concepts form the foundation of all data science work - from cleaning messy datasets to building machine learning models. With these tools in your toolkit, you're ready to start your journey into the fascinating world of data analysis and discovery!

Study Notes

• Variables: Store data without declaring types - name = "Alice", age = 17

• Data Types: Strings (text), integers (whole numbers), floats (decimals), booleans (True/False)

• Lists: Ordered, mutable collections - scores = [85, 92, 78]

• Dictionaries: Key-value pairs - student = {"name": "John", "grade": 95}

• If Statements: Make decisions with if, elif, else

• For Loops: Iterate through sequences - for item in list:

• While Loops: Repeat until condition is false - while condition:

• Functions: Reusable code blocks - def function_name(parameters):

• Return Values: Send results back with return

• List Comprehensions: Concise list creation - [x*2 for x in numbers]

• String Methods: .strip(), .lower(), .replace(), .split()

• Indentation: Python uses spaces/tabs to define code blocks (4 spaces recommended)

• Comments: Use # for single-line comments to document code

Python Basics — Data Science | A-Warded