Friendly SQL
Friendly SQL
DuckDB offers several advanced SQL features and syntactic sugar to make SQL queries more concise. We refer to these colloquially as “friendly SQL”.
Several of these features are also supported in other systems while some are (currently) exclusive to DuckDB.
Clauses
Creating tables and inserting data: CREATE OR REPLACE TABLE: avoid DROP TABLE IF EXISTS statements in scripts. CREATE TABLE ... AS SELECT (CTAS): create a new table from the output of a table without manually defining a schema. INSERT INTO ... BY NAME: this variant of the INSERT statement allows using column names instead of positions. INSERT OR IGNORE INTO ...: insert the rows that do not result in a conflict due to UNIQUE or PRIMARY KEY constraints. INSERT OR REPLACE INTO ...: insert the rows that do not result in a conflict due to UNIQUE or PRIMARY KEY constraints. For those that result in a conflict, replace the columns of the existing row to the new values of the to-be-inserted row. Describing tables and computing statistics: DESCRIBE: provides a succinct summary of the schema of a table or query. SUMMARIZE: returns summary statistics for a table or query. Making SQL clauses more compact and readable: FROM-first syntax with an optional SELECT clause: DuckDB allows queries in the form of FROM tbl which selects all columns (performing a SELECT * statement). GROUP BY ALL: omit the group-by columns by inferring them from the list of attributes in the SELECT clause. ORDER BY ALL: shorthand to order on all columns (e.g., to ensure deterministic results). SELECT * EXCLUDE: the EXCLUDE option allows excluding specific columns from the * expression. SELECT * REPLACE: the REPLACE option allows replacing specific columns with different expressions in a * expression. UNION BY NAME: perform the UNION operation along the names of columns (instead of relying on positions). Prefix aliases in the SELECT and FROM clauses: write x: 42 instead of 42 AS x for improved readability. Specifying a percentage of the table size for the LIMIT clause: write LIMIT 10% to return 10% of the query results. Transforming tables: PIVOT to turn long tables to wide tables. UNPIVOT to turn wide tables to long tables. Defining SQL-level variables: SET VARIABLE RESET VARIABLEQuery Features
Column aliases in WHERE, GROUP BY, and HAVING. (Note that column aliases cannot be used in the ON clause of JOIN clauses.) COLUMNS() expression can be used to execute the same expression on multiple columns: with regular expressions with EXCLUDE and REPLACE with lambda functions Reusable column aliases (also known as “lateral column aliases”), e.g.: SELECT i + 1 AS j, j + 2 AS k FROM range(0, 3) t(i) Advanced aggregation features for analytical (OLAP) queries: FILTER clause GROUPING SETS, GROUP BY CUBE, GROUP BY ROLLUP clauses count() shorthand for count(*) IN operator for lists and maps Specifying column names for common table expressions (WITH) Specifying column names in the JOIN clause Using VALUES in the JOIN clause Using VALUES in the anchor part of common table expressionsLiterals and Identifiers
Case-insensitivity while maintaining case of entities in the catalog Deduplicating identifiers Underscores as digit separators in numeric literalsData Types
MAP data type UNION data typeData Import
Auto-detecting the headers and schema of CSV files Directly querying CSV files and Parquet files Replacement scans: You can load from files using the syntax FROM 'my.csv', FROM 'my.csv.gz', FROM 'my.parquet', etc. In Python, you can access Pandas data frames using FROM df. Filename expansion (globbing), e.g.: FROM 'my-data/part-*.parquet'Functions and Expressions
Dot operator for function chaining: SELECT ('hello').upper() String formatters: the format() function with the fmt syntax and the printf() function List comprehensions List slicing and indexing from the back ([-1]) String slicing STRUCT.* notation Creating LIST using square brackets Simple LIST and STRUCT creation Updating the schema of STRUCTsJoin Types
ASOF joins LATERAL joins POSITIONAL joinsTrailing Commas
DuckDB allows trailing commas, both when listing entities (e.g., column and table names) and when constructing LIST items. For example, the following query works:
SELECT 42 AS x, ['a', 'b', 'c',] AS y, 'hello world' AS z, ;
"Top-N in Group" Queries
Computing the "top-N rows in a group" ordered by some criteria is a common task in SQL that unfortunately often requires a complex query involving window functions and/or subqueries.
To aid in this, DuckDB provides the aggregate functions max(arg, n), min(arg, n), arg_max(arg, val, n), arg_min(arg, val, n), max_by(arg, val, n) and min_by(arg, val, n) to efficiently return the "top" n rows in a group based on a specific column in either ascending or descending order.
For example, let's use the following table:
SELECT * FROM t1;
┌─────────┬───────┐ │ grp │ val │ │ varchar │ int32 │ ├─────────┼───────┤ │ a │ 2 │ │ a │ 1 │ │ b │ 5 │ │ b │ 4 │ │ a │ 3 │ │ b │ 6 │ └─────────┴───────┘
We want to get a list of the top-3 val values in each group grp. The conventional way to do this is to use a window function in a subquery:
SELECT array_agg(rs.val), rs.grp FROM (SELECT val, grp, row_number() OVER (PARTITION BY grp ORDER BY val DESC) AS rid FROM t1 ORDER BY val DESC) AS rs WHERE rid < 4 GROUP BY rs.grp;
┌───────────────────┬─────────┐ │ array_agg(rs.val) │ grp │ │ int32[] │ varchar │ ├───────────────────┼─────────┤ │ [3, 2, 1] │ a │ │ [6, 5, 4] │ b │ └───────────────────┴─────────┘
But in DuckDB, we can do this much more concisely (and efficiently!):
SELECT max(val, 3) FROM t1 GROUP BY grp;
┌─────────────┐ │ max(val, 3) │ │ int32[] │ ├─────────────┤ │ [3, 2, 1] │ │ [6, 5, 4] │ └─────────────┘ “Friendlier SQL with DuckDB” blog post “Even Friendlier SQL with DuckDB” blog post “SQL Gymnastics: Bending SQL into Flexible New Shapes” blog post
网址:Friendly SQL https://mxgxt.com/news/view/1670801
相关内容
明星sql数据库sql server关系图怎么看
SQL 多对多关系 联合主键
SQL Server关系引擎组件是什么?
使用SQL Editor连接StarRocks实例
SQL零基础逆袭!3步速成实战技巧
闪捷信息科技申请基于 SQL 语句的数据处理方法专利,能够准确地判定非法 SQL 语句
3.5k star!一款万能的 SQL 分析工具,厉害了!
SQL 语句中 where 条件后 写上 1=1 是什么意思!
Oracle SQL*Plus的DESCRIBE命令:数据结构的“侦探”