Working with DB Metadata

SQL Expression Language allows composable construction of SQL queries. The foundation is that these queries are Python objects that represent DB concepts like tables and columns. These objects are known as “DB metadata” (database metadata).

Common objects are MetaDataTable, and Column.

The Metadata object is our root object, our starting point, the head of the tree, a “facade around a Python dictionary that stores a series of Table objects keyed to their string name”.

from sqlalchemy import MetaData
metadata_obj = MetaData()

Then declare (and attach?) Table objects:

from sqlalchemy import Table, Column, Integer, String
user_table = Table(
    "user_account",
    metadata_obj,
    Column("id", Integer, primary_key=True),
    Column("name", String(30)),
    Column("fullname", String),
)

Now when we want to do anything related to the user’s table, we refer to user_table.

Best practices

  • Have a single MetaData object for an entire application. Keep it in a module-level variable in a single place (often in “models” or “dbschema”)
  • Commonly accessed via an ORM-centric registry or Declarative Base base class
  • Checkout Nominatim for a reference on how to structure a project. Spoiler: they keep their MetaData in /src/nominatim_api/sql/sqlalchemy_schema.py::SearchTables

Table construct

  • Docs
  • Represents SQL’s CREATE TABLE statement:
from sqlalchemy import Table, Column, Integer, String
user_table = Table(
    "user_account",
    metadata_obj,
    Column("id", Integer, primary_key=True),
    Column("name", String(30)),
    Column("fullname", String),
)
>>> user_table.c.name
>>> Column('name', String(length=30), table=<user_account>)

To view all of the columns:

for col in user_table.columns:
	print(repr(col)) # or print(f"{col!r}")

outputs:

Column('id', Integer(), table=<user_account>, primary_key=True, nullable=False)
Column('name', String(length=30), table=<user_account>)
Column('fullname', String(), table=<user_account>)

Declaring Simple Constraints

from sqlalchemy import ForeignKey
address_table = Table(
    "address",
    metadata_obj,
    Column("id", Integer, primary_key=True),
    Column("user_id", ForeignKey("user_account.id"), nullable=False),
    Column("email_address", String, nullable=False),
)

nullable=False is the same as IS NOT NULL in SQL.

Emitting DDL to the DB

Once you have the tables attached/added to the MetaData object, you can run a CREATE TABLE, or DDL to the SQLite db (from sqlalchemy-101)

metadata_obj.create_all(engine)

Cool tricks:

  • Checks if each table exists before emitting CREATE
  • Takes care of emitting CREATE statements in the correct order. Meaning constraint dependent (eg: FOREIGN KEY) tables get created after the table they depend on
  • MetaData also provides a MetaData.drop_all() that drops tables in reverse order as it would with CREATE

ORM Declarative Forms

Up to now, we use Table to create tables but SQLAlchemy ORM provides a facade around the Table declaration/creation process (referred as “Declarative Table”). The ORM provides a DeclarativeBase class. When we use the DeclarativeBase as a subclass of our new classes they will become an ORM mapped class. Each one typically referring to a particular Table object.

Under the hood SQLAlchemy inspects our new class, its class attributes and their associated metadata (data type, column name, etc). It then automatically creates a Table object and associates it with the new class. It also creates ORM mappings so we can work with database rows as instances of the new class.

So what’s “ORM mapping”? It is a mapping between any Python class we created (via DeclarativeBase), that classes attributes which will be linked to the columns in a database table. There are other ways of achieving this but the most common is known as declarative mapping (also see declarative config).

So what’s “declarative mapping”? The example above used imperative mapping which is the “classical” form to create a table construct. The declarative mapping is the typical way of creating mappings in modern SQLAlchemy:

from sqlalchemy import Integer, String, ForeignKey
from sqlalchemy.orm import DeclarativeBase
from sqlalchemy.orm import Mapped
from sqlalchemy.orm import mapped_column
 
# declarative base class
class Base(DeclarativeBase):
	pass
 
class UserAccount(Base):
	__tablename__ = "user_account"
	id: Mapped[int] = mapped_column(primary_key=True)
	name: Mappend[str]
	fullname: Mapped[str] = mapped_column(String(30))

Okay… ORM mapping? Declarative mapping? The what …? The ORM mapping relies on the declarative mapping to know what to map. We tell SQLAlchemy how our Python classes and our database tables relate using the declarative mapping. And on runtime/execution, actually using those relationships to interact with the database is the ORM mapping.

Declarative Base

Now we can access the MetaData via the Base we declared above:

>>> Base.metadata
MetaData()

We also have access to the registry, which is the basis for maintaining a collection of mappings. It also provides configurational hooks used to map classes. The registry is seldom accessed directly but it is available:

>>> Base.registry
<sqlalchemy.orm.decl_api.registry object at 0x...>

Declaring Mapped Classes

Revisiting our User and Address declarations with declarative mapping. Notice the Mapped type, it is a special type which comes from PEP 484 :

from typing import List
from typing import Optional
 
from sqlalchemy import String
from sqlalchemy.orm import (
	DeclarativeBase,
	Mapped,
	mapped_column,
	relationship
	)
 
class Base(DeclarativeBase):
	pass
 
class User(Base):
	__tablename__ = "user_account"
 
	id: Mapped[int] = mapped_column(primary_key=True)
	name: Mapped[str] = mapped_column(String(30))
	addresses: Mapped[List["Address"]] = relationship(back_populates="user")
 
	def __repr__(self) -> str:
		return f"User(id={self.id!r}, name={self.name!r}, fullname={self.fullname!r})"
 
class Address(Base):
	__tablename__ = "address"
 
	id: Mapped[int] = mapped_column(primary_key=True)
	email_address: Mapped[str]
	user_id = mapped_column(ForeignKey("user_account.id"))
	user: Mapped[User] = relationship(back_populates="addresses")
 
	def __repr__(self) -> str:
		return f"Address(id={self.id!r}, email_address={self.email_address!r})"
	

Now if we want to emit the DDL to the DB from an ORM mapping we can:

>>> Base.metadata.create_all(engine)

References

https://docs.sqlalchemy.org/en/20/tutorial/metadata.html https://docs.sqlalchemy.org/en/20/orm/mapping_styles.html