SQL Style Guide: Formatting Best Practices for 2026
A SQL style guide is a set of conventions that make queries readable and keep a team’s diffs consistent. SQL itself does not care: keywords are case-insensitive and whitespace is ignored, so SELECT, select, and SeLeCt all run the same, and a 200-character one-liner returns the exact same rows as that query spread across twenty indented lines. Style is about the humans who read the query later, not the database.
If you just need a clean query right now, paste it into the SQL formatter, pick your dialect, and copy the result. Knowing the rules behind that output is what lets you set a team standard instead of arguing about it in every pull request. This guide walks through the choices that matter: keyword casing, indentation and line breaks, naming, dialect-specific quirks, and how to automate the whole thing.
One thing to keep in mind throughout. Because SQL ignores whitespace and casing, none of these rules are enforced by the database. They exist to serve the people who read, review, and maintain the query. That has two consequences. First, there is rarely a single “correct” answer; most of these decisions come down to picking a reasonable convention and applying it everywhere, and this guide is honest about where the real trade-offs lie rather than pretending one style wins outright. Second, since the rules are conventions and not requirements, they only pay off when you apply them consistently. That is why most sections land on the same advice: decide once, then let a tool enforce it.
Why SQL Formatting Matters
The clearest argument for formatting shows up in code review. An ORM or a build step often emits queries as a single unbroken line:
select u.id,u.name,count(o.id) as orders from users u left join orders o on o.user_id=u.id where u.active=true group by u.id,u.name order by orders desc
Nobody can review that. Reformatted, the structure is obvious and the diff is reviewable line by line:
SELECT
u.id,
u.name,
COUNT(o.id) AS orders
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.active = true
GROUP BY u.id, u.name
ORDER BY orders DESC;
Debugging benefits the same way. When you copy a one-line query out of a slow-query log and it has three joins and a tangled WHERE, formatting it first turns “where is the bug” into a thirty-second scan. The faulty predicate gets its own line, the joins are stacked, and an accidental Cartesian product or a forgotten filter becomes visible instead of buried in a wall of text. The same trick helps when you are reading SQL that some other system generated, since query builders and reporting tools are notorious for emitting correct but unreadable output.
Consistency is the quieter win, and over time the bigger one. When everyone formats the same way, diffs show only what actually changed, a new column or a tweaked filter, instead of noise from one person’s spacing preferences fighting another’s. A reviewer’s attention is finite, and spending it on reflowed whitespace is waste. Consistent formatting also makes onboarding easier: a new hire reads queries that all look alike, learns the team’s shape once, and applies it everywhere. None of this asks anyone to format by hand, which is the point of the final section. You decide the rules, a tool applies them.
One invariant sits under all of this, and it bears repeating because the rest of the guide leans on it: formatting only changes whitespace, line breaks, keyword casing, and comments. It never changes the query’s logic or its results. A formatted query is the same query. That is why you can safely run the messy version above through the SQL formatter without worrying about what it returns.
Keyword Casing: UPPERCASE vs lowercase
The oldest debate in any SQL style guide is whether reserved keywords should be UPPERCASE or lowercase. Both are valid, because SQL is case-insensitive for keywords. The disagreement is about readability, so it helps to understand both sides before you pick.
The case for UPPERCASE keywords
The traditional argument is visual contrast. Writing SELECT, FROM, WHERE, JOIN, and GROUP BY in capitals makes the keywords pop out from the lowercase table and column names, so you can scan the shape of a query, its clauses and structure, without relying on an editor to color them in.
That matters more than it sounds, because SQL travels through plenty of places that have no syntax highlighting: log files, email threads, pull request descriptions, a plain-text diff, a Slack message, a monitoring dashboard, a stack trace. In all of those, uppercase keywords are the only thing keeping the structure legible. Strip the highlighting away and select id from users where active is a soup of lowercase words, while SELECT id FROM users WHERE active still reads as a query at a glance. This is the convention in most older style guides, in database documentation, and in nearly every SQL textbook, which is also why a lot of developers find uppercase keywords more familiar even when their editor would color them anyway.
The case for lowercase keywords
The modern counter-argument is that syntax highlighting solved the contrast problem. Every editor and IDE colors keywords distinctly, so capitalizing them is redundant, and to some readers all-caps reads as shouting. It is also slightly faster to type without reaching for shift on every keyword.
The lowercase style has real momentum in the analytics-engineering world. The dbt community and several widely cited team style guides default to lowercase keywords, on the logic that highlighting carries the visual weight and lowercase keeps the query calmer to read. There is a subtler point in their favor too: lowercase keywords sit at the same visual level as your snake_case table and column names, so the whole query reads as one consistent block of text rather than shouting keywords and quiet identifiers competing for attention. Whether that is a feature or a bug is the kind of thing teams disagree about, which brings us to the verdict that actually holds up.
The verdict: consistency beats the choice
Which one you pick is far less important than picking one and enforcing it. A codebase where half the queries shout SELECT and the other half whisper select is the worst outcome, because the inconsistency itself becomes noise. Mixed casing in a single query is worse still.
Consistency wins for a mechanical reason, not an aesthetic one. Inconsistent casing makes diffs lie: a reviewer sees a line “change” that is really just someone reformatting a keyword, and the actual change hides in the noise. Grep and search get less reliable too when the same keyword shows up in three casings. A single enforced style removes that overhead at the cost of one decision. So decide as a team, write it down, and let a tool enforce it instead of relying on discipline. The SQL formatter has a Keywords control with three options (UPPERCASE, lowercase, and Preserve), so you can normalize an entire pile of historical queries to one style in a single click. The same query, rendered both ways:
-- UPPERCASE
SELECT id, email FROM users WHERE active = true ORDER BY created_at DESC;
-- lowercase
select id, email from users where active = true order by created_at desc;
Pick the one your team prefers. The point is that all your queries match it.
Indentation and Line Breaks
Casing decides how keywords look. Indentation and line breaks decide how the query’s logic maps onto the page, and that is where most of the readability lives.
The “river” style vs block style
Simon Holywell’s well-known sqlstyle.guide popularized the “river” style, where keywords are right-aligned so a vertical channel of whitespace runs down the middle of the query:
SELECT id,
email,
created_at
FROM users
WHERE active = true
ORDER BY created_at DESC;
The appeal is that SELECT, FROM, and WHERE line up on their right edge and the column list sits cleanly to the right of the river. The drawbacks are practical, though. The alignment depends on the length of your longest keyword, so adding a LEFT JOIN can force you to re-indent everything; it is painful to maintain by hand; and it produces noisy diffs, because changing one keyword’s length shifts the whitespace on neighboring lines.
The block (or left-aligned) style starts each major clause at the left margin on its own line and indents the clause’s contents:
SELECT
id,
email,
created_at
FROM users
WHERE active = true
ORDER BY created_at DESC;
This is the mainstream default and the one most tools produce, mostly because it is stable: adding a clause never reflows the lines above it, so diffs stay small and the layout survives automated formatting. The river style optimizes for how a finished query looks in isolation; the block style optimizes for how queries change over time and how they review in version control. For anything that lives in a repository and gets edited, the block style is the safer bet, and it is what the rest of this guide assumes.
How many spaces: 2 vs 4 vs tabs
Once you indent, you have to decide how far. The three common answers each have a rationale:
- 2 spaces is the most common default. It keeps diffs compact and nested queries from marching off the right edge of the screen.
- 4 spaces gives each level of nesting more visual separation, which helps in queries with deep subqueries or many CTE levels.
- Tabs let every developer pick their own display width without changing the file.
There is no universally correct answer here, which is why the SQL formatter exposes an Indent control with all three (2 spaces, 4 spaces, Tab). Choose one and apply it everywhere.
Where to break lines
Indentation width is the easy part. The higher-impact decision is where to insert line breaks:
SELECTcolumns: one column per line for anything non-trivial, so adding or removing a column touches exactly one line in the diff. Very short queries can stay on a single line.FROMandJOIN: start each join on its own line, with theONcondition either trailing or indented beneath it. This keeps the join graph readable.WHERE: put eachAND/ORon its own line so the boolean logic reads top to bottom. For mixedAND/ORconditions, parenthesize and indent the groups so the precedence is explicit rather than something the reader has to work out.
These are guidelines, not laws. A trivial SELECT id FROM users WHERE id = 1 does not need five lines, and forcing it onto them hurts readability rather than helping. The rough rule of thumb: break when the query has more than one or two columns, more than one table, or more than one condition. Below that threshold a single line is clearer; above it, break aggressively. A good formatter encodes a sensible threshold for you, but knowing the principle means the output never surprises you.
Applied to the messy one-liner from earlier, those rules produce a layout where every clause and every join is visible at a glance:
SELECT
u.id,
u.name,
COUNT(o.id) AS orders
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.active = true
GROUP BY u.id, u.name
ORDER BY orders DESC;
Leading vs trailing commas
A smaller but persistent question: where does the comma go in a multi-line column list?
-- Leading commas
SELECT
id
, email
, created_at
FROM users;
-- Trailing commas
SELECT
id,
email,
created_at
FROM users;
Leading commas have a genuine advantage: adding or removing a column changes a single line, and a missing comma is easy to spot because the offending line stands out. Trailing commas read more naturally and are far more common in practice. Both are fine. Pick one, and let the formatter apply it so nobody has to think about it again.
Naming Conventions for Tables and Columns
Formatting governs whitespace; naming governs the identifiers themselves, and a style guide is incomplete without it.
The de facto standard for SQL identifiers is snake_case: all lowercase, words separated by underscores, like user_id, created_at, and order_items. It earns that status for a concrete reason, not just habit. snake_case identifiers never need quoting and behave consistently across dialects, whereas camelCase (common in application code) collides with how databases fold case, which we will get to in a moment.
This differs from application code in a way worth spelling out. In most programming languages, the surrounding code controls the identifiers, and camelCase or PascalCase is the norm. SQL identifiers, by contrast, are interpreted by the database’s own case-folding rules, and those rules are what makes mixed-case names fragile. snake_case dodges the whole issue: there is no case to fold, no reason to quote, and nothing that behaves differently from one engine to the next.
A few more conventions that show up in nearly every SQL style guide:
- Singular vs plural table names is a genuine split.
users(plural, “the table holds users”) anduser(singular, “each row is a user”) both have advocates. Like casing, the choice matters less than applying it consistently to every table. - Avoid reserved words as identifiers. Naming a column
order,user, ortableforces you to quote it everywhere and invites confusing errors. Reach fororder_idoraccountinstead. - Keep key naming consistent. A primary key called
idand foreign keys named<referenced_table>_id(such asuser_id) make joins predictable and self-documenting.
One trap bites teams that name database columns the way they name application variables. In PostgreSQL, an unquoted identifier is folded to lowercase, so SELECT userId FROM t actually looks for a column named userid. The moment you quote it as "userId", the database preserves the case and treats "userId" and userid as two different columns:
-- Creates a column whose real name is lowercase "userid"
CREATE TABLE t (userId integer);
-- Both of these work — the name was folded to lowercase
SELECT userId FROM t;
SELECT userid FROM t;
-- This fails: "column \"userId\" does not exist"
-- The quotes force an exact, case-sensitive match
SELECT "userId" FROM t;
Different databases fold case in different directions (Oracle folds unquoted identifiers to uppercase, several others to lowercase), so mixed-case quoted identifiers are not even portable. The clean way out is to avoid quoted, mixed-case identifiers entirely and stick to snake_case, which sidesteps the whole problem and keeps your schema readable in every dialect.
For a deeper comparison of camelCase, snake_case, and kebab-case, including when each one is the right call across code and data, see the naming conventions guide.
Formatting Across SQL Dialects
Everything so far is largely dialect-agnostic. Casing, indentation, line breaks, and naming apply no matter which database you target. But “format this SQL” runs into a wall the moment your query uses syntax specific to one database, because a generic parser that does not recognize that syntax will mangle it: it may split a token in the wrong place, misread an operator, or treat a quoting character as a string delimiter and swallow half the query. This is where dialect-aware formatting earns its keep, and it is why the formatter asks you to pick a database first rather than guessing. The differences below are the ones you hit most often in everyday queries.
Here is how the four most common dialect-divergent operations compare at a glance:
| Operation | PostgreSQL | MySQL / MariaDB | SQL Server (T-SQL) | Oracle | Standard SQL |
|---|---|---|---|---|---|
| String concatenation | || or CONCAT() | CONCAT() | + or CONCAT() | || or CONCAT() | || |
| NULL fallback | COALESCE() | COALESCE() / IFNULL() | COALESCE() / ISNULL() | COALESCE() / NVL() | COALESCE() |
| Row limiting | LIMIT | LIMIT | TOP / OFFSET … FETCH | FETCH FIRST | FETCH FIRST |
| Identifier quoting | double quotes ("…") | backticks | square brackets ([…]) | double quotes ("…") | double quotes ("…") |
The sections below show each of these in runnable context.
String concatenation and NULL handling
Two of the most common everyday operations are spelled differently across dialects.
String concatenation:
-- PostgreSQL, Oracle, SQLite (standard operator)
SELECT first_name || ' ' || last_name AS full_name FROM users;
-- SQL Server (T-SQL uses +)
SELECT first_name + ' ' + last_name AS full_name FROM users;
-- Portable across dialects
SELECT CONCAT(first_name, ' ', last_name) AS full_name FROM users;
NULL fallbacks:
-- Standard SQL (works everywhere)
SELECT COALESCE(nickname, name) AS display_name FROM users;
-- SQL Server only
SELECT ISNULL(nickname, name) AS display_name FROM users;
-- MySQL / MariaDB only
SELECT IFNULL(nickname, name) AS display_name FROM users;
A formatter set to the wrong dialect may not understand ISNULL or the || operator and can misparse the surrounding query.
Row limiting and identifier quoting
Limiting result rows is one of the most dialect-divergent pieces of syntax:
-- PostgreSQL, MySQL, SQLite
SELECT id, name FROM users ORDER BY created_at DESC LIMIT 10;
-- SQL Server (T-SQL)
SELECT TOP 10 id, name FROM users ORDER BY created_at DESC;
-- Standard SQL / Oracle
SELECT id, name FROM users ORDER BY created_at DESC FETCH FIRST 10 ROWS ONLY;
Identifier quoting splits three ways too. When you must quote an identifier, usually to use a reserved word or preserve case, the delimiter depends on the database:
-- MySQL / MariaDB use backticks
SELECT `order`, `user` FROM `select`;
-- SQL Server (T-SQL) uses square brackets
SELECT [order], [user] FROM [select];
-- Standard SQL (PostgreSQL, Oracle, SQLite) uses double quotes
SELECT "order", "user" FROM "select";
A formatter that thinks MySQL backticks are string delimiters, or that T-SQL brackets are something else, will produce broken output. The dialect setting tells it which is which. This is also why copy-pasting a query between databases is rarely a clean swap: the same logical intent, whether that is concatenating two strings, falling back on NULL, limiting to ten rows, or quoting a reserved word, is written four different ways across the dialects, and only a parser that knows your database can reformat it without corrupting it.
Why dialect-aware formatting matters
This is why the SQL formatter ships with nine dialects (PostgreSQL, MySQL, SQL Server (T-SQL), BigQuery, Snowflake, Oracle, SQLite, MariaDB, and Standard SQL) rather than a single generic mode. Selecting the right one means the parser correctly handles PostgreSQL dollar-quoted strings and :: casts, T-SQL bracketed identifiers and TOP, warehouse-specific functions in BigQuery and Snowflake, and the quoting rules above, instead of guessing and getting them wrong. Pick your actual database from the dropdown before you format, and the output comes back correct and idiomatic.
Automating SQL Formatting
Reading the rules is one thing; nobody should be applying them by hand. The whole point of a style guide is that a machine enforces it. There are three places to wire formatting in, depending on how much friction you want to cut.
In your editor (format on save)
The lowest-effort option is to format automatically every time you save. VS Code has SQL formatter extensions that run on save, and JetBrains DataGrip and the database tools in other IDEs ship with a built-in formatter you can bind to a keystroke or a save hook. Once it is set up, your queries are never unformatted, the same model as Prettier for JavaScript or gofmt for Go. The catch is that editor settings live on each developer’s machine, so format-on-save keeps your SQL tidy but does not, on its own, guarantee the rest of the team’s does. For that you need the next layer.
In CI with a linter
To enforce style across a whole team, move the check into continuous integration. A SQL linter such as sqlfluff both lints and auto-fixes: you encode your rules (dialect, keyword casing, indentation, comma placement) in a .sqlfluff config file, run sqlfluff lint to flag violations and sqlfluff fix to repair them, and have CI fail any pull request that drifts from the agreed style. This is the same idea as ESLint or Prettier gating a frontend repo: style stops being a review comment that someone has to remember to leave and becomes a passing or failing check that the machine never forgets. The payoff is that style debates happen once, when you write the config, instead of in every pull request.
One-off online formatting
Sometimes you just have one ugly query and no desire to install anything: a snippet from a log, a colleague’s Slack message, a query you are pasting into documentation. For that, paste it into the SQL formatter, choose the dialect, casing, and indentation, and copy the clean result.
The privacy detail matters here, and it is easy to overlook. Many online formatters send the text you paste to a server to do the work, which means a copy of your query, including table names, column names, and sometimes literal values from a production incident, leaves your machine. The SQL formatter runs entirely in your browser, so your SQL is never uploaded anywhere. That makes it safe to format queries that touch production schemas or proprietary logic, which is exactly when you most want clean formatting and least want to hand your query to a third party. If you are wrangling other formats in the same workflow, the sibling JSON formatter works the same way, with in-browser processing and one-click copy.
The three approaches are not mutually exclusive, and the best setup usually combines them: format-on-save for the fast inner loop while you write, a CI linter as the backstop that enforces the team standard, and an online formatter for the throwaway snippets that never touch your repo. Whichever you reach for, remember the invariant one last time: none of these tools change what your query does. They rearrange whitespace, line breaks, casing, and comments, and nothing else.
Frequently Asked Questions
Should SQL keywords be uppercase or lowercase?
Both are valid, because SQL keywords are case-insensitive. UPPERCASE makes keywords stand out in environments without syntax highlighting, such as logs and diffs; lowercase is easier to type and fits modern editors that already color keywords. The thing that matters is that the whole team picks one and a formatter enforces it. Mixed casing is the worst choice.
What is the best indentation for SQL?
Two spaces is the most common default and keeps diffs compact; four spaces makes deeply nested queries easier to read; tabs let each developer choose their own display width. There is no single correct answer, so pick one and apply it consistently across your team. Most SQL formatters, including this one, support all three options.
How do I format SQL automatically?
There are three ways to format SQL automatically: format-on-save in your editor (VS Code or DataGrip), a linter in CI such as sqlfluff that auto-fixes style, or an online SQL formatter for one-off pasting. The online route is fastest because it needs no installation. Just paste, pick your dialect, and copy the result.
Should I use leading or trailing commas in SQL?
Leading commas (comma at the start of each line) give cleaner diffs when adding or removing columns and make a missing comma easy to spot; trailing commas (comma at the end) read more naturally and are more common. Both are acceptable in any SQL style guide, so the key is choosing one and letting a formatter apply it automatically.
Does formatting SQL change how the query runs?
No. Formatting SQL only changes whitespace, line breaks, keyword casing, and comments. It never changes the query’s logic. A formatted query returns exactly the same results as the original, which is why it is safe to beautify even production queries before reviewing or running them.
What naming convention should I use for SQL tables and columns?
snake_case, all lowercase with underscores, is the de facto standard for SQL table and column names because it avoids quoting and stays safe across dialects. Keep primary keys (id) and foreign keys (user_id) named consistently, and avoid using reserved words like order or user as identifiers to prevent quoting headaches.
How do I format SQL for a specific dialect like PostgreSQL or T-SQL?
Select the matching dialect in the formatter first. PostgreSQL mode correctly handles :: casts and dollar-quoted strings; SQL Server (T-SQL) mode understands bracketed [identifiers] and TOP. Choosing the wrong dialect lets a generic parser mangle dialect-specific syntax, so always set it to your real database before formatting.
Is there a standard SQL style guide?
There is no official standard, but several widely referenced ones exist: Simon Holywell’s sqlstyle.guide and the public style guides from teams like Mozilla and the dbt community. Their shared consensus, namely consistent indentation, snake_case identifiers, and a line break before each major clause, is what this guide codifies, and a formatter can enforce it for you.