DynamoDB single-table design, worked through - Writing

Fourth post in the AWS series, and the first one that does not need LocalStack Ultimate. DynamoDB runs on the free Hobby tier.

Repo: github.com/danieljohnmorris/dynamodb-single-table. It deploys one table with one GSI, seeds 13 rows across four entity types, and ships a runnable script per access pattern.

What single-table design is

The whole application’s data lives in one DynamoDB table, with multiple entity types sharing the partition and sort keys. Key characteristics:

One table holds every entity (customer, order, product, line item, in this example). No table per entity.
Primary key columns are generic: PK and SK are strings that get formatted differently per entity (CUST#c1, ORDER#o100, etc.).
A small number of global secondary indexes cover access patterns that the base table cannot serve. This example uses one.
Access patterns are designed up front and the keys are reverse-engineered from them. You do not start with entities, you start with the queries you need.
Joins are replaced with co-located rows: putting related items in the same partition lets one Query return them together. This shape is known as the adjacency list pattern.

The cost is rigidity. Adding a new access pattern after the design is shipped often means a new GSI or a backfill, sometimes both.

Why not one table per entity

The obvious alternative is the MongoDB shape: a Customers table, an Orders table, a Products table, a LineItems table. The answer is a tradeoff.

Multi-table buys flexibility. A new access pattern is a new Query against the relevant table, with no backfill or key redesign.
Multi-table pays at read time. Fetching an order with its line items and product details is three round trips your application code stitches together.
Single-table buys read efficiency. The same fetch is one Query because the rows already sit in the same partition.
Single-table pays at design time. You must know your access patterns before you ship, and changing them later is expensive.

The rule I use: pick single-table when access patterns are stable and read latency matters. Pick multi-table when the data model is still moving or reads are infrequent enough that the extra round trips do not register.

The example

Four entity types, six access patterns. The patterns:

#	Pattern	Operation
1	Get customer by ID	`GetItem(PK=CUST#<id>, SK=METADATA)`
2	Get all orders for a customer, newest first	`Query(GSI1, PK=CUST#<id>, SK begins_with ORDER#)`
3	Get an order with all line items	`Query(PK=ORDER#<id>)`
4	Get product by SKU	`GetItem(PK=PRODUCT#<sku>, SK=METADATA)`
5	Every order containing a SKU	`Query(GSI1, PK=PRODUCT#<sku>)`
6	List all customers chronologically	`Query(GSI1, PK=CUSTOMER)`

The keys

Centralised in lib/keys.ts:

export const customerKey = (id: string) =>
  ({ PK: `CUST#${id}`, SK: 'METADATA' });

export const orderHeaderKey = (orderId: string) =>
  ({ PK: `ORDER#${orderId}`, SK: 'METADATA' });

export const orderItemKey = (orderId: string, sku: string) =>
  ({ PK: `ORDER#${orderId}`, SK: `ITEM#${sku}` });

export const productKey = (sku: string) =>
  ({ PK: `PRODUCT#${sku}`, SK: 'METADATA' });

Two things to notice. The order header and order items share the partition ORDER#<id>. That is what lets pattern 3 fetch the whole order in one Query. And the SK prefix is what distinguishes them: METADATA for the header, ITEM#<sku> for each line item. A begins_with(SK, 'ITEM#') would filter to just the items if needed.

The GSI

Patterns 2, 5, and 6 cross partitions, so they need a GSI. The GSI keys are computed differently per entity:

// Customer: lives in a shared partition so listing is one Query
export const customerGsi = (createdAt: string) =>
  ({ GSI1PK: 'CUSTOMER', GSI1SK: createdAt });

// Order header: indexed by customer so a customer can find their orders
export const orderHeaderGsi = (customerId: string, createdAt: string, orderId: string) =>
  ({ GSI1PK: `CUST#${customerId}`, GSI1SK: `ORDER#${createdAt}#${orderId}` });

// Order item: indexed by SKU so a product can find every order containing it
export const orderItemGsi = (sku: string, orderId: string) =>
  ({ GSI1PK: `PRODUCT#${sku}`, GSI1SK: `ORDER#${orderId}` });

The same GSI1PK slot is doing three different jobs depending on the row’s entity type. CUSTOMER (a literal string) for the list-all-customers pattern. CUST#<id> for the find-orders-by-customer pattern. PRODUCT#<sku> for the find-orders-by-product pattern. This is the hallmark of single-table design: indexes are reused across entity types by varying what each row writes into the key columns.

Caveat on GSI1PK = CUSTOMER. Every customer row ends up in the same logical partition on GSI1. That works for the seed of three customers, and works fine up to mid-scale. At very large scale you would shard by first letter, signup month, region, or similar. The repo notes this in the script comment but does not implement the sharding.

Why GSI and not LSI

DynamoDB has a second index type, the local secondary index. Three differences matter:

An LSI reuses the base table’s partition key and only varies the sort key. A GSI lets you partition on a different attribute entirely.
LSIs must be declared at table creation. GSIs can be added or removed at any time.
LSIs share the base table’s read/write capacity. GSIs have their own.

For this design only a GSI works. Patterns 2, 5, and 6 each need a different partition key from the base table (CUST#<id>, PRODUCT#<sku>, and the literal CUSTOMER), and an LSI cannot change the partition key. Even when an LSI would have served, I default to GSI for new tables because it is the only index type you can add post-launch when a new access pattern shows up.

One Query for a whole order

Pattern 3 is the one that justifies single-table design more clearly than the others. The order header and every line item share the partition ORDER#<id>, so:

const res = await ddb.send(new QueryCommand({
  TableName: TABLE_NAME,
  KeyConditionExpression: 'PK = :pk',
  ExpressionAttributeValues: marshall({ ':pk': `ORDER#${orderId}` }),
}));

returns the METADATA row plus every ITEM# row in one round trip. In a relational schema this would be a join. Real output from the repo:

Order o101 for c1 (SHIPPED, £310)
  1 x AVON-TORTOISE @ £145
  1 x GRANBY-CLEAR @ £165

The script reads them straight from the response, no second call.

Reverse-chronological orders

Pattern 2 uses GSI1 with ScanIndexForward: false so the most recent order comes first:

const res = await ddb.send(new QueryCommand({
  TableName: TABLE_NAME,
  IndexName: 'GSI1',
  KeyConditionExpression: 'GSI1PK = :pk AND begins_with(GSI1SK, :sk)',
  ExpressionAttributeValues: marshall({
    ':pk': `CUST#${customerId}`,
    ':sk': 'ORDER#',
  }),
  ScanIndexForward: false,
}));

The GSI1SK for orders is ORDER#<isoCreatedAt>#<orderId>. ISO timestamps sort lexicographically the same way they sort by date, so descending lexicographic order is descending chronological order. Real output:

Orders for c1:
  o101  SHIPPED  £310  2026-04-10T14:00:00Z
  o100  DELIVERED  £145  2026-03-01T10:00:00Z

What this approach pays for

Picking up the four entity rows by primary key (customer, order header, order items, product) avoids the cross-table scatter-gather that a relational schema would do. The two GetItem patterns are single-digit milliseconds, and the three Query patterns each return a focused result set.

The price is the up-front design work. Adding a seventh access pattern, for example “orders by status in the last 30 days”, would require either a new GSI or a sparse index. Single-table design needs you to know the queries up front. Adding a new one after the table is live is expensive.

For an entity with a known set of queries that does not change often, this trade is good. For a product where the data shape is still in flux, it is worth knowing the cost.