CLUSTER 中文man页面-cluster的中文

NAME

CLUSTER - 根据一个索引对某个表集簇

SYNOPSIS

CLUSTER indexname ON tablename
CLUSTER tablename
CLUSTER

DESCRIPTION 描述

CLUSTER 指示PostgreSQL 基于索引 indexname 的度量对表 table 进行存储建簇。索引必须已经在表 tablename. 上定义。

当对一个表建簇后，该表的物理存储将基于索引信息进行。建簇是一次性操作：也就是说，当表随后被更新后，改变的内容不会建簇。也就是说，系统不会试图按照索引顺序对更新过的记录重新建簇。如果需要，可以通过周期性地手工执行该命令的方法重建簇。

在对一个表建簇之后，PostgreSQL 会记忆在哪个索引上建立的簇。 CLUSTER tablename 的形式就在表以前建簇的同一个索引上建簇。

没有任何参数的 CLUSTER 将导致当前数据库里所有调用它的用户所有的表都被建簇。（绝不会对不包括进来的表建簇。）这种形式的 CLUSTER 不能在一个事务或者函数里面调用。

在对一个表进行建簇的时候，则在其上请求一个 ACCESS EXCLUSIVE 锁。这样就避免了在 CLUSTER 完成之前执行任何其他的数据库操作（包括读写）。参阅 Section 12.3 获取有关数据库锁定的更多信息。

PARAMETERS 参数

indexname: 一个索引名称。
tablename: 准备建簇的表的名称（可能有模式修饰）。

NOTES 注意

如果你只是随机的访问表中的行，那么在堆表中的数据的实际存储顺序是无关紧要的。但是，如果你对某些数据的访问多于其他数据，而且有一个索引将这些数据分组，那你就将从使用 CLUSTER 中获益。如果你从一个表中请求一定索引范围的值，或者是一个索引过的值对应多行， CLUSTER 也会有助于应用，因为如果索引标识出***匹配行所在的堆存储页，所有其他行也可能已经在同一堆存储页里了，这样便节省了磁盘访问的时间，加速了查询。

在这个建簇的操作过程中，系统先创建一个按照索引顺序建立的表的临时拷贝。同时也建立表上的每个索引的临时拷贝。因此，你需要磁盘上有足够的剩余空间，至少是表大小和索引大小的和。

因为 CLUSTER 记忆建簇信息，我们可以在***次的时候手工对表进行建簇，然后设置一个类似 VACUUM 的时间，这样我们就可以周期地自动对表进行建簇了。

因为规划器记录着有关表的排序的统计，所以我们建议在新近建簇的表上运行 ANALYZE。否则，规划器可能会选择很差劲的查询规划。

还有一种建簇的方法。 CLUSTER 命令将原表按你声明的索引重新排列。这个动作在操作大表时可能会很慢，因为每一行都从堆存储页里按索引顺序取出，如果存储页表没有排序，整个表是随机存放在各个页面的，因而移动的每一行都要进行一次磁盘页面操作。 PostgreSQL 有一个缓冲，但一个大表的主体是不可能都放到缓冲去的。另外一种对表建簇的方法是

CREATE TABLE newtable AS
    SELECT columnlist FROM table ORDER BY columnlist;

这个用法使用PostgreSQL 排序的代码 ORDER BY 来创建一个需要的顺序，在对未排序的数据操作时通常速度比索引扫描快得多。然后你可以删除旧表，用 ALTER TABLE ... RENAME将 newtable 改成旧表名，并且重建该表所有索引。但是，这个方法不保留 OID，约束，外键关系，赋予的权限，以及表的其它附属的属性 ---- 所有这些属性都必须手工重建。

EXAMPLES 例子

以雇员的 emp_ind 属性对employees关系建簇。

CLUSTER emp_ind ON emp;

使用以前用过的同一个索引对employees表进行建簇：

CLUSTER emp;

对以前建过簇的所有表进行建簇：

CLUSTER;

COMPATIBILITY 兼容性

在 SQL 标准里没有 CLUSTER 语句。

NAME

CLUSTER - cluster a table according to an index

SYNOPSIS

CLUSTER indexname ON tablename
CLUSTER tablename
CLUSTER

DESCRIPTION

CLUSTER instructs PostgreSQL to cluster the table specified by tablename based on the index specified by indexname. The index must already have been defined on tablename.

When a table is clustered, it is physically reordered based on the index information. Clustering is a one-time operation: when the table is subsequently updated, the changes are not clustered. That is, no attempt is made to store new or updated rows according to their index order. If one wishes, one can periodically recluster by issuing the command again.

When a table is clustered, PostgreSQL remembers on which index it was clustered. The form CLUSTER tablename, reclusters the table on the same index that it was clustered before.

CLUSTER without any parameter reclusters all the tables in the current database that the calling user owns, or all tables if called by a superuser. (Never-clustered tables are not included.) This form of CLUSTER cannot be called from inside a transaction or function.

When a table is being clustered, an ACCESS EXCLUSIVE lock is acquired on it. This prevents any other database operations (both reads and writes) from operating on the table until the CLUSTER is finished.

PARAMETERS

indexname: The name of an index.
tablename: The name (possibly schema-qualified) of a table.

NOTES

In cases where you are accessing single rows randomly within a table, the actual order of the data in the table is unimportant. However, if you tend to access some data more than others, and there is an index that groups them together, you will benefit from using CLUSTER. If you are requesting a range of indexed values from a table, or a single indexed value that has multiple rows that match, CLUSTER will help because once the index identifies the heap page for the first row that matches, all other rows that match are probably already on the same heap page, and so you save disk accesses and speed up the query.

During the cluster operation, a temporary copy of the table is created that contains the table data in the index order. Temporary copies of each index on the table are created as well. Therefore, you need free space on disk at least equal to the sum of the table size and the index sizes.

Because CLUSTER remembers the clustering information, one can cluster the tables one wants clustered manually the first time, and setup a timed event similar to VACUUM so that the tables are periodically reclustered.

Because the planner records statistics about the ordering of tables, it is advisable to run ANALYZE on the newly clustered table. Otherwise, the planner may make poor choices of query plans.

There is another way to cluster data. The CLUSTER command reorders the original table using the ordering of the index you specify. This can be slow on large tables because the rows are fetched from the heap in index order, and if the heap table is unordered, the entries are on random pages, so there is one disk page retrieved for every row moved. (PostgreSQL has a cache, but the majority of a big table will not fit in the cache.) The other way to cluster a table is to use

CREATE TABLE newtable AS
    SELECT columnlist FROM table ORDER BY columnlist;

which uses the PostgreSQL sorting code in the ORDER BY clause to create the desired order; this is usually much faster than an index scan for unordered data. You then drop the old table, use ALTER TABLE ... RENAME to rename newtable to the old name, and recreate the table's indexes. However, this approach does not preserve OIDs, constraints, foreign key relationships, granted privileges, and other ancillary properties of the table --- all such items must be manually recreated.

EXAMPLES

Cluster the table employees on the basis of its index emp_ind:

CLUSTER emp_ind ON emp;

Cluster the employees relation using the same index that was used before:

CLUSTER emp;

Cluster all the tables on the database that have previously been clustered:

CLUSTER;

COMPATIBILITY

There is no CLUSTER statement in the SQL standard.

CLUSTER 中文man页面

NAME

SYNOPSIS

DESCRIPTION 描述

PARAMETERS 参数

NOTES 注意

EXAMPLES 例子

COMPATIBILITY 兼容性

SEE ALSO 参见

NAME

SYNOPSIS

DESCRIPTION

PARAMETERS

NOTES

EXAMPLES

COMPATIBILITY

SEE ALSO