Creating a table in PostgreSQL (psql) involves defining its structure, including column names, data types, constraints, and optional default values. Let’s go through it step by step.


1. Basic Syntax

CREATE TABLE table_name (
    column1 datatype constraints,
    column2 datatype constraints,
    ...
);

Example: Creating a Users Table

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    email VARCHAR(100) UNIQUE NOT NULL,
    age INTEGER CHECK (age >= 18),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

Explanation

  • id SERIAL PRIMARY KEY → Auto-incrementing primary key.
  • name VARCHAR(100) NOT NULL → A string (max 100 chars), cannot be NULL.
  • email VARCHAR(100) UNIQUE NOT NULL → Unique email, required.
  • age INTEGER CHECK (age >= 18) → Integer, must be 18 or older.
  • created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP → Stores creation timestamp.

2. Common Data Types

Data Type Description
SERIAL Auto-incrementing integer (1, 2, 3...)
INTEGER Whole number
BIGINT Large whole number
VARCHAR(n) Variable-length string (max n chars)
TEXT Unlimited-length string
BOOLEAN True/False
DATE Stores a date (YYYY-MM-DD)
TIMESTAMP Date & time (YYYY-MM-DD HH:MI:SS)
DECIMAL(p, s) Fixed precision decimal (p = total digits, s = decimal places)
BYTEA Binary data (for storing images, files, etc.)

3. Constraints in Table Creation

Constraints enforce rules on the data stored in a table.

Constraint Description
PRIMARY KEY Uniquely identifies a record
UNIQUE Ensures unique values in a column
NOT NULL Prevents NULL values
CHECK(condition) Validates data before inserting/updating
DEFAULT value Sets a default value if none is provided
FOREIGN KEY Enforces referential integrity between tables

Example: Adding Foreign Keys

If you have a users table and an orders table, you can enforce a relationship:

CREATE TABLE orders (
    id SERIAL PRIMARY KEY,
    user_id INTEGER REFERENCES users(id) ON DELETE CASCADE,
    total_price DECIMAL(10,2) NOT NULL,
    order_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
  • user_id INTEGER REFERENCES users(id) ON DELETE CASCADE → Ensures that when a user is deleted, their orders are also removed.

4. Creating a Table with Indexing

Indexes speed up queries. Example:

CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    price DECIMAL(10,2) NOT NULL,
    category VARCHAR(100),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Create an index on the category column for faster searches
CREATE INDEX idx_category ON products(category);

5. Creating a Table with Composite Primary Key

A composite primary key uses multiple columns:

CREATE TABLE student_courses (
    student_id INTEGER,
    course_id INTEGER,
    grade CHAR(1),
    PRIMARY KEY (student_id, course_id)
);

This ensures that a student cannot enroll in the same course more than once.


6. Checking Created Tables

After creating a table, check its structure using:

\d table_name

or

SELECT column_name, data_type FROM information_schema.columns WHERE table_name = 'table_name';

7. Dropping and Altering a Table

Dropping a Table

DROP TABLE table_name;

Adding a Column

ALTER TABLE users ADD COLUMN phone_number VARCHAR(15);

Removing a Column

ALTER TABLE users DROP COLUMN phone_number;

This covers the essentials of table creation in PostgreSQL. Let me know if you need help with something specific! 🚀

Subqueries in PostgreSQL (psql) - Detailed Explanation

A subquery (also called a nested query or inner query) is a query that is embedded within another SQL query. It is used to retrieve data that will be used by the main query (also called the outer query).


1. Types of Subqueries

Subqueries can be classified into different types based on their usage:

Type Description
Scalar Subquery Returns a single value
Single-row Subquery Returns one row with one or more columns
Multi-row Subquery Returns multiple rows
Correlated Subquery Subquery depends on the outer query

2. Basic Syntax

SELECT column_name
FROM table_name
WHERE column_name = (SELECT column_name FROM another_table WHERE condition);

3. Scalar Subquery

A scalar subquery returns a single value. It can be used in SELECT, WHERE, or HAVING clauses.

Example: Fetch all users who placed the highest order.

SELECT name 
FROM users 
WHERE id = (SELECT user_id FROM orders ORDER BY total_price DESC LIMIT 1);
  • The subquery returns the user_id with the highest total_price order.
  • The outer query fetches the corresponding user’s name.

4. Single-row Subquery

A single-row subquery returns exactly one row and is used with comparison operators like =, >, <, >=, <=.

Example: Find the employee who earns the highest salary.

SELECT name, salary 
FROM employees 
WHERE salary = (SELECT MAX(salary) FROM employees);
  • The subquery calculates the highest salary.
  • The outer query finds the employee with that salary.

5. Multi-row Subquery

A multi-row subquery returns multiple rows and must be used with operators like IN, ANY, ALL.

Example: Find all products belonging to categories that have more than 5 products.

SELECT name 
FROM products 
WHERE category_id IN (
    SELECT category_id 
    FROM products 
    GROUP BY category_id 
    HAVING COUNT(*) > 5
);
  • The subquery finds category_ids with more than 5 products.
  • The outer query selects products from those categories.

6. Correlated Subquery

A correlated subquery executes once for each row of the outer query. It refers to a column from the outer query.

Example: Find employees who earn more than the average salary of their department.

SELECT e1.name, e1.salary, e1.department_id
FROM employees e1
WHERE e1.salary > (
    SELECT AVG(e2.salary) 
    FROM employees e2 
    WHERE e2.department_id = e1.department_id
);
  • The subquery calculates the average salary per department.
  • The outer query finds employees earning more than their department’s average.

7. Subquery in FROM Clause (Derived Table)

Subqueries in the FROM clause act like temporary tables.

Example: Find departments where the average salary is more than 50,000.

SELECT department_id, avg_salary
FROM (
    SELECT department_id, AVG(salary) AS avg_salary 
    FROM employees 
    GROUP BY department_id
) AS dept_avg
WHERE avg_salary > 50000;
  • The inner query (dept_avg) calculates the average salary per department.
  • The outer query filters departments where the average salary exceeds 50,000.

8. Subquery with EXISTS

The EXISTS operator checks if a subquery returns any rows.

Example: Find customers who have placed at least one order.

SELECT name 
FROM customers c
WHERE EXISTS (
    SELECT 1 FROM orders o WHERE o.customer_id = c.id
);
  • The subquery checks if an order exists for a customer.
  • If a row exists, EXISTS returns TRUE, and the customer is included.

9. Subquery with NOT EXISTS

The NOT EXISTS operator finds records where the subquery returns no rows.

Example: Find customers who have not placed any orders.

SELECT name 
FROM customers c
WHERE NOT EXISTS (
    SELECT 1 FROM orders o WHERE o.customer_id = c.id
);
  • This query finds customers who have no orders.

10. Subquery Performance Tips

  • Use Indexing: Index foreign keys and frequently queried columns to speed up subqueries.
  • Avoid Correlated Subqueries if Possible: They execute once per row, which can be slow.
  • Use Joins Instead of Subqueries When Possible: Joins are often more efficient.

Example: Rewriting a Subquery with a Join

Instead of:

SELECT name 
FROM customers 
WHERE id IN (SELECT customer_id FROM orders);

Use:

SELECT DISTINCT c.name 
FROM customers c 
JOIN orders o ON c.id = o.customer_id;

Why?

  • The JOIN is generally faster than IN for large datasets.

Final Thoughts

  • Use scalar subqueries when expecting a single value.
  • Use multi-row subqueries with IN, ANY, or ALL.
  • Use correlated subqueries carefully as they can be slow.
  • Use EXISTS for better performance in checking record existence.

Let me know if you need a more advanced example (TBD)