Index: docs/reference/src/main/docbook/en-US/content/jcr/query_and_search.xml =================================================================== --- docs/reference/src/main/docbook/en-US/content/jcr/query_and_search.xml (revision 2347) +++ docs/reference/src/main/docbook/en-US/content/jcr/query_and_search.xml (working copy) @@ -137,8 +137,8 @@ while ( rowIter.hasNext() ) { session.logout(); ]]> - For more detail about these methods or about how to use other facets of the JCR query API, please consult Section 6.7 of the - JCR 1.0 specification. + For more detail about these methods or about how to use other facets of the JCR query API, please consult chapter 6 of the + JCR 2.0 specification. @@ -834,10 +834,21 @@ offset ::= /* Non-negative integer value */ value will be treated as a literal value. If the subquery is used in a clause that expects a single value (e.g., in a comparison), only the subquery's first row will be used. If the subquery is used in a clause that allows multiple values (e.g., IN (...)), then all of the subquery's rows will be used. - For example, this query "WHERE ... [my:type].[prop1] IN ( SELECT [my:prop2] FROM [my:type2] WHERE [my:prop3] < '1000' ) AND ..." + For example, this query "WHERE ... [my:type].[prop1] IN ( SELECT [my:prop2] FROM [my:type2] WHERE [my:prop3] < '1000' ) AND ..." will use the results of the subquery as the literal values in the IN clause. + + + Support for several pseudo-columns ("jcr:path", "jcr:score", "jcr:name", + "mode:localName", and "mode:depth") that can be used in the SELECT, + equijoin, and WHERE clauses. These pseudo-columns + make it possible to return location-related and score information within the &QueryResult;'s rows. + They also make queries look more like SQL, and thus may be more friendly and easier to use in existing + SQL-aware client applications. See the detailed description + for more information. 
+ + @@ -906,7 +917,7 @@ JoinCondition ::= EquiJoinCondition | SameNodeJoinCondition | ]]> - + Equi-Join Conditions - + Constraints - + Property Existence Constraints Dynamic Operands + + Pseudo-columns + + The design of the JCR-SQL2 query language makes fairly heavy use of functions, including + SCORE(), NAME(), and LOCALNAME(). ModeShape adds several + more useful functions, including PATH() and DEPTH(), that follow the + same patterns. + + + However, there are several disadvantages of these functions. First, they make the JCR-SQL2 language + less "SQL-like", since SQL-92 and -99 don't define these kinds of functions. (There are aggregate + functions, like COUNT, SUM, etc., but they are not terribly analogous.) + This means that applications that use SQL and SQL-like query languages are less likely to be + able to build and issue JCR-SQL2 queries. + + + A second disadvantage of these functions is that JCR-SQL2 does not allow them to be used within + the SELECT clause. As a result, the location-related and score information cannot + be included as columns of values in the &QueryResult; rows. Instead, a client can only + access this information by obtaining the &Node; object(s) for each row. Relying upon both the result + set and additional Java objects makes it difficult to use. + + + For example, ModeShape's JDBC driver is designed to enable JDBC-aware applications to query + repository content using JCR-SQL2 queries. The standard JDBC API cannot expose the &Node; objects, + so the only way to return the path-related and score information is through additional columns + in the result. While such columns could "magically" appear in the result set, doing this is + not compatible with JDBC applications that dynamically build queries based upon database metadata. + Such applications require the columns to be properly described in database metadata, and the + columns need to be used within queries. 
+ + + ModeShape attempts to solve these issues by directly supporting a number of "pseudo-columns" + within JCR-SQL2 queries, wherever columns can be used. These "pseudo-columns" include: + + + + jcr:score is a column of type DOUBLE that + represents the full-text search score of the node, which is a measure of the node's + relevance to the full-text search expression. ModeShape does compute the scores for all + queries, though the score for rows in queries that do not include full-text search + criteria may not be reliable. + + + + + jcr:path is a column of type PATH that + represents the normalized path of a node, including same-name-sibling indexes. This is the same + as what would be returned by the getPath() method of &Node;. + Examples of paths include "/jcr:system" and "/foo/bar[3]". + + + + + jcr:name is a column of type NAME that + represents the node name in its namespace-qualified form using namespace prefixes and + excluding same-name-sibling indexes. + Examples of node names include "jcr:system", "jcr:content", "ex:UserData", and "bar". + + + + + mode:localName is a column of type STRING that + represents the local name of the node, which excludes the namespace prefix and same-name-sibling index. + As an example, the local name of the "jcr:system" node is "system", while the local name + of the "ex:UserData[3]" node is "UserData". + + + + + mode:depth is a column of type LONG that + represents the depth of a node, which corresponds exactly to the number of path segments within the path. + For example, the depth of the root node is 0, whereas the depth of the "/jcr:system/jcr:nodeTypes" node is 2. + + + + + + All of these pseudo-columns can be used in the SELECT clause of any JCR-SQL2 query, and their + use defines whether such columns appear in the result set. However, none of these pseudo-columns will be included + when "SELECT *" is used; instead, they must be explicitly named. 
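The name, local-name, and depth definitions listed above can be illustrated with a small, self-contained Java sketch. This is not ModeShape code: the class PseudoColumnSketch and its methods are hypothetical, and the sketch assumes paths in the standard prefix-qualified form (such as "/foo/ex:UserData[3]"), not the expanded form with namespace URIs. It only mirrors the rules stated in the list, using the same example values.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper (not part of ModeShape or the JCR API) that derives the
 * values described for jcr:name, mode:localName, and mode:depth from a path string.
 * Assumes prefix-qualified paths such as "/jcr:system/jcr:nodeTypes".
 */
public final class PseudoColumnSketch {

    private PseudoColumnSketch() {
    }

    /** Splits an absolute path into its segments; the root path "/" has none. */
    public static List<String> segments(String path) {
        List<String> result = new ArrayList<>();
        for (String segment : path.split("/")) {
            if (!segment.isEmpty()) {
                result.add(segment);
            }
        }
        return result;
    }

    /** mode:depth: the number of path segments, so the root node has depth 0. */
    public static long depth(String path) {
        return segments(path).size();
    }

    /** jcr:name: the last segment without its same-name-sibling index; empty for the root. */
    public static String name(String path) {
        List<String> segs = segments(path);
        if (segs.isEmpty()) {
            return "";
        }
        String last = segs.get(segs.size() - 1);
        int bracket = last.indexOf('[');
        return bracket < 0 ? last : last.substring(0, bracket);
    }

    /** mode:localName: the name without its namespace prefix. */
    public static String localName(String path) {
        String name = name(path);
        int colon = name.indexOf(':');
        return colon < 0 ? name : name.substring(colon + 1);
    }

    public static void main(String[] args) {
        String[] paths = {"/", "/jcr:system/jcr:nodeTypes", "/foo/ex:UserData[3]"};
        for (String path : paths) {
            System.out.println(path + " name=" + name(path)
                    + " localName=" + localName(path) + " depth=" + depth(path));
        }
    }
}
```

In a real query these values are computed by the repository and returned as columns; the sketch is only meant to make the definitions concrete.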
+ + + All of these pseudo-columns can also be used in the WHERE clause of any JCR-SQL2 query, even + if they are not included in the SELECT clause. They can be used anywhere that a regular column + can be used, including within constraints and + dynamic operands. ModeShape will automatically rewrite + queries that use pseudo-columns in the dynamic operands to use the corresponding function, such as + SCORE(), PATH(), NAME(), LOCALNAME(), and DEPTH(). + Additionally, any property existence constraint using + these pseudo-columns will always evaluate to 'true' (and will thus be removed by the optimizer). + + + The jcr:path pseudo-column may also be used on both sides of an equijoin constraint + clause. For example: + + + + Equijoins of this form will be automatically rewritten by the optimizer to the following form: + + + + As with regular columns, the pseudo-columns must be qualified with the selector name if the query + contains more than one selector. + + + + Note that the jcr:path and jcr:score pseudo-columns are consistent with + the pseudo-columns of the same names used in the JCR-SQL + query language. However, unlike in JCR-SQL, in JCR-SQL2 these columns are not automatically included + in the results. + + + + + Example JCR-SQL2 queries + + One of the simplest JCR-SQL2 queries finds all nodes in the current workspace of the repository: + + + + This query will return a result set containing a single "jcr:primaryType" column, since nt:base defines only + one single-valued property called "jcr:primaryType". (The jcr:mixinTypes property is multi-valued, and as + such the JCR 2.0 specification does not require returning these in query results.) + + + Queries can explicitly specify the columns that are to be returned in the results. The following query is + semantically equivalent to the previous query, and produces identical results: + + + + Even though "SELECT *" does not expand to include pseudo-columns, they can be explicitly included. 
+ The following query will return the same rows as in the previous queries, but will have two additional columns + with values computed from the nodes' locations: + + + + In JCR-SQL2, a table representing a particular node type will have a column for each of the node type's property definitions, + including those inherited from supertypes. For example, the nt:file node type, its nt:hierarchyNode + supertype, and the mix:created mixin type are defined using the CND notation as follows: + + mix:created abstract + +[nt:file] > nt:hierarchyNode + + jcr:content (nt:base) primary mandatory +]]> + + Therefore, the table representing the nt:file node type will have three columns: the jcr:created + and jcr:createdBy columns inherited from the mix:created mixin node type (via the nt:hierarchyNode node type), + and the jcr:primaryType column inherited from the nt:base node type, which is the implicit supertype + of nt:hierarchyNode. Thus, this query: + + + + is equivalent to this query: + + + + Additionally, ModeShape's pseudo-columns can be used to also include the path of the resulting nt:file nodes and + to apply constraints based upon the local name of these nodes: + + + + Note that the local name constraint could also be specified using the mode:localName pseudo-column supported by ModeShape. + This query is for all intents and purposes equivalent to the previous query and will produce the exact same results: + + + + Although this query looks much more like SQL, the use of the '[' and ']' characters to quote the identifiers is not typical of + a SQL dialect. ModeShape actually supports using double-quote characters and square brackets interchangeably around identifiers + (although they must match around any single identifier). 
Again, this next query, + which looks remarkably like any SQL-92 or -99 dialect, is functionally identical to the previous query: + + + + In JCR-SQL2, a node will appear as a row in each table that corresponds to the node types defined by that node's + primary type or mixin types, or any supertypes of these node types. In other words, a node will appear in the table + corresponding to each node type for which Node.isNodeType(...) returns true. + + + For example, consider a node that has a primary type of nt:file but has a mixin of mix:referenceable. + This node will appear as a result in the nt:file, mix:referenceable, nt:hierarchyNode, + mix:created, and nt:base tables. The table for nt:file contains all of the columns + in the nt:hierarchyNode, mix:created, and nt:base tables. However, the + nt:file table does not contain the jcr:uuid column, since the + nt:file node type does not extend mix:referenceable. + Thus, to obtain the UUID for our node, we need to perform an identity join. The next + query shows how this is done to return all properties for nt:file nodes that are + also mix:referenceable: + + + + The select clause would be expanded to the following query: + + + + Of course, we could return even more information and make the query look very SQL-like by using pseudo-columns: + + + + These are examples of two-way inner joins, but ModeShape supports joining multiple tables together + in a single query. ModeShape also supports a variety of joins, including INNER JOIN (or just JOIN), + LEFT OUTER JOIN, RIGHT OUTER JOIN, FULL OUTER JOIN, and CROSS JOIN. + + + ModeShape supports several other query features beyond JCR-SQL2. One of these is support for UNION, + INTERSECT and EXCEPT. Here is an example of a union: + + + + ModeShape also supports using (non-correlated) subqueries within the WHERE clause, wherever a + static operand can be used. Subqueries can even be used within + another subquery. 
All subqueries, though, must return a single column, and each row's single value will be + treated as a literal value. If the subquery is used in a clause that expects a single value (e.g., in a comparison), + only the subquery's first row will be used. + + + Subqueries in ModeShape are a powerful and easy way to use more complex criteria that are a function of the content + in the repository, without having to resort to multiple queries (that is, taking the results of one query and dynamically + generating the criteria of another query). + + + Here's an example of a query that finds all nt:file nodes in the repository whose paths are referenced + in the vdb:originalFile property of the vdb:virtualDatabase nodes. (This query + also uses bind variables in the subquery.) + + + + Without subqueries, this query would need to be broken into two separate queries: the first would find all of the + paths referenced by the vdb:virtualDatabase nodes matching the version and description criteria, + followed by one (or more) subsequent queries to find the nt:file nodes with the paths expressed + as literal values (or bind variables). + + + The examples in this section begin to show the power and flexibility of JCR-SQL2 and the ModeShape extensions. + + Full-Text Search Language