Welcome, fellow devs! Whether you're just stepping into the world of Python or brushing up your skills, this guide is designed to give you a hands-on, practical experience with real-world Python features — from handling massive data files to writing clean, efficient, and reusable code.
Let's dive right in!
Part 1: File Handling in Python
1. Basic File Operations
Python makes file operations a breeze using the built-in open() function. Here's a simple way to open, read, and close a file:
file = open("sample.txt", "r")
content = file.read()
file.close()
But there's a better way — enter the with statement:
with open("sample.txt", "r") as file:
content = file.read()
Why use with?
- It closes the file automatically, even if an exception is raised (roughly equivalent to the try/finally sketch below)
- Prevents resource leaks and lingering file locks
- Cleaner and more Pythonic
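Under the hood, the with statement does roughly what this hand-written version does; a minimal sketch for illustration:
file = open("sample.txt", "r")
try:
    content = file.read()
finally:
    file.close()  # runs even if read() raises an exception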
2. Reading and Writing Files
Reading a file line-by-line:
with open("sample.txt", "r") as file:
lines = file.readlines()
Writing to a file:
with open("output.txt", "w") as file:
file.write("Hello, world!")
Appending to a file:
with open("output.txt", "a") as file:
file.write("\nNew line added!")
3. Handling Large Files Efficiently
Trying to load a massive file all at once? ❌ Not ideal.
Instead, use these efficient techniques:
Reading line-by-line (streaming):
with open("large_file.txt", "r") as file:
for line in file:
print(line.strip())
Reading in chunks:
with open("large_file.txt", "r") as file:
while chunk := file.read(1024):
print(chunk)
This way, you only load small portions of the file into memory at a time.
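Streaming keeps memory usage flat no matter how big the file gets, which also makes it easy to compute summaries on the fly. A minimal sketch that counts lines without ever loading the whole file:
line_count = 0
with open("large_file.txt", "r") as file:
    for line in file:
        line_count += 1  # only the current line is held in memory
print(f"Total lines: {line_count}")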
4. Working with CSV and Excel Files Using Pandas
If you're working with structured data, Pandas is your best friend:
import pandas as pd
df = pd.read_csv("data.csv")
print(df.head())
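Once the data is loaded, filtering rows is a one-liner with boolean indexing. A quick sketch, assuming data.csv has a hypothetical numeric column called "price":
expensive = df[df["price"] > 100]  # keep only rows where the assumed "price" column exceeds 100
print(expensive.head())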
To write a CSV:
df.to_csv("output.csv", index=False)
To handle Excel files (the .xlsx format needs an engine such as openpyxl installed):
df = pd.read_excel("data.xlsx", sheet_name="Sheet1")
df.to_excel("output.xlsx", index=False, sheet_name="Results")
Handling large CSVs in chunks:
chunk_size = 10000
for chunk in pd.read_csv("large_data.csv", chunksize=chunk_size):
    print(chunk.shape)
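Chunked reading pays off when you aggregate as you go, so each chunk can be discarded right after it's processed. A small sketch that totals the row count across all chunks:
total_rows = 0
for chunk in pd.read_csv("large_data.csv", chunksize=chunk_size):
    total_rows += len(chunk)  # process the chunk, then let it go
print(f"Total rows: {total_rows}")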
Part 2: Parallel Processing in Python
Want to do more in less time? Parallelism lets you make better use of your CPU and of the time spent waiting on I/O.
1. Multithreading (Great for I/O-bound tasks)
import threading
def print_numbers():
    for i in range(5):
        print(i)
thread1 = threading.Thread(target=print_numbers)
thread2 = threading.Thread(target=print_numbers)
thread1.start()
thread2.start()
thread1.join()
thread2.join()
Threads are great for tasks like these (see the sketch after the list):
- Downloading files
- Reading/writing files
- Making multiple API calls
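As a small illustration, here's a sketch that simulates a few slow downloads with time.sleep; the file names and one-second delay are made up purely for the example:
import threading
import time

def download(name):
    time.sleep(1)  # stand-in for slow network or disk I/O
    print(f"Finished {name}")

threads = [threading.Thread(target=download, args=(f"file{i}.zip",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # the three simulated downloads overlap, so this takes about one second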
2. Multiprocessing (Perfect for CPU-bound tasks)
from multiprocessing import Pool
def square(n):
    return n * n

if __name__ == "__main__":
    with Pool(4) as p:
        result = p.map(square, [1, 2, 3, 4])
    print(result)
Use this when you're crunching data or running heavy calculations: each worker is a separate process with its own interpreter, so CPU-bound work sidesteps the GIL and runs in true parallel.
3. concurrent.futures (Simpler Parallelism)
For I/O-bound tasks:
from concurrent.futures import ThreadPoolExecutor
def fetch_data(url):
    return f"Fetched {url}"
urls = ["https://site1.com", "https://site2.com"]
with ThreadPoolExecutor() as executor:
    results = executor.map(fetch_data, urls)
    print(list(results))
For CPU-bound tasks:
from concurrent.futures import ProcessPoolExecutor
def cube(n):
    return n ** 3

# the __main__ guard matters here too: on Windows and macOS, worker processes
# are spawned by re-importing this module
if __name__ == "__main__":
    with ProcessPoolExecutor() as executor:
        results = executor.map(cube, [1, 2, 3, 4])
        print(list(results))
Part 3: Decorators — Python's Superpower
Decorators let you wrap functions with extra behavior.
1. A Simple Decorator
def my_decorator(func):
    def wrapper():
        print("Before function call")
        func()
        print("After function call")
    return wrapper

@my_decorator
def say_hello():
    print("Hello!")
say_hello()
2. Decorator with Arguments
def repeat(n):
    def decorator(func):
        def wrapper(*args, **kwargs):
            for _ in range(n):
                func(*args, **kwargs)
        return wrapper
    return decorator

@repeat(3)
def greet():
    print("Hello!")

greet()  # prints "Hello!" three times
3. Using functools.wraps
import functools
def log(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print(f"Calling {func.__name__} with {args}")
        return func(*args, **kwargs)
    return wrapper

@log
def add(a, b):
    return a + b
print(add(2, 3))
functools.wraps keeps the original function's name and docstring intact.
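You can see the effect by checking the wrapped function's metadata; this just reads an attribute, so the log wrapper never fires:
print(add.__name__)  # prints "add"; without functools.wraps it would print "wrapper"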
Lambda Functions
Short, anonymous functions. Great for one-liners.
add = lambda x, y: x + y
print(add(5, 3)) # 8
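Where lambdas really earn their keep is as throwaway key functions, for example when sorting:
words = ["banana", "fig", "apple"]
print(sorted(words, key=lambda w: len(w)))  # ['fig', 'apple', 'banana'] (shortest first)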
List Comprehensions
squares = [x ** 2 for x in range(5)]
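An if clause lets you filter while you build; a quick sketch:
even_squares = [x ** 2 for x in range(10) if x % 2 == 0]  # [0, 4, 16, 36, 64]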
Dictionary Comprehensions
squares_dict = {x: x ** 2 for x in range(5)}
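The same syntax is handy for reshaping existing data, for example inverting a small mapping (the values here are made up just for the sketch):
prices = {"apple": 1.5, "banana": 0.5}
by_price = {price: name for name, price in prices.items()}  # {1.5: 'apple', 0.5: 'banana'}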
Assignment Questions (Practice Makes Perfect!)
Part 1: File Handling
- Write a Python program that reads a CSV file, filters rows where a specific column > 100, and writes the result to a new file.
- Modify the program to process large files in chunks.
Part 2: Parallel Processing
- Use multithreading to download multiple files simultaneously.
- Use multiprocessing to compute factorials of numbers from 1 to 10.
Part 3: Decorators
- Create a decorator that logs function execution time.
- Write a decorator that caches results of function calls.
Thanks for your time! Feel free to ask any questions!
Which concept would you like to see a deep dive on next? Let me know in the comments! 💬