Type: Feature Request
Resolution: Obsolete
Priority: Major
Fix Version/s: Backlog
Affects Version/s: 8.10
Component/s: Query Engine
Labels: None

Currently, dependent joins generate one or more IN clauses. Many MPP / NoSQL systems can perform drastically better when the join keys are instead loaded into a temp table whose distribution matches that of the table being joined. Two examples I know of are Netezza and Hive.

In Netezza, if the incoming side of the dependent join (a small dimension; here "Customer", using Northwind data model concepts) has a key that will be joined to a big fact table that is distributed (DISTRIBUTE ON) or organized (ORGANIZE ON) by that key, then creating a temp table that matches this distribution will result in roughly 100x better query performance. If the dimension is small enough, this sometimes makes little difference because Netezza will perform a broadcast join, but creating the temp table is never a bad idea.
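To make the contrast concrete, here is a minimal Python sketch that builds the SQL text for both strategies. It is an illustration only, not the query engine's actual pushdown code: the function names, the `orders` fact table, the integer `customer_id` key, and the `tmp_dep_join_keys` temp table name are all hypothetical, and the fact table is assumed to be distributed on the join key.

```python
from typing import List, Sequence


def in_clause_query(fact: str, key: str, values: Sequence[int]) -> str:
    """Current style: push the dimension keys down as an IN predicate."""
    literals = ", ".join(str(int(v)) for v in values)
    return f"SELECT * FROM {fact} WHERE {key} IN ({literals})"


def temp_table_statements(
    fact: str,
    key: str,
    values: Sequence[int],
    temp: str = "tmp_dep_join_keys",  # hypothetical temp table name
) -> List[str]:
    """Proposed style: stage the keys in a temp table that is distributed
    on the same column as the fact table, then join against it."""
    statements = [
        # Netezza-style DDL; assumes the fact table is also DISTRIBUTE ON (key).
        f"CREATE TEMP TABLE {temp} ({key} INTEGER) DISTRIBUTE ON ({key})",
    ]
    # One INSERT per key keeps the sketch simple; a real loader would batch.
    statements += [f"INSERT INTO {temp} VALUES ({int(v)})" for v in values]
    statements.append(
        f"SELECT f.* FROM {fact} f JOIN {temp} t ON f.{key} = t.{key}"
    )
    return statements


if __name__ == "__main__":
    print(in_clause_query("orders", "customer_id", [1, 2, 3]))
    for stmt in temp_table_statements("orders", "customer_id", [1, 2, 3]):
        print(stmt)
```

Because the temp table and the fact table share a distribution key, the join can in principle run locally on each data slice instead of redistributing rows, which is the effect the request is after.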

Similarly, Hive DDL supports both partitions and buckets (which can be pre-sorted).
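For reference, the kind of Hive layout a staged key table would need to line up with can be sketched as follows. The helper name, the table and column names, and the bucket count are hypothetical. Whether a Hive temporary table can carry the same clauses is not confirmed here, so the sketch only builds the DDL for the fact-table side.

```python
def hive_fact_ddl(table: str, key: str, partition_col: str, buckets: int) -> str:
    """Build Hive DDL for a table that is partitioned and also bucketed
    (and sorted within each bucket) on the join key."""
    return (
        f"CREATE TABLE {table} ({key} INT, amount DOUBLE) "
        f"PARTITIONED BY ({partition_col} STRING) "
        f"CLUSTERED BY ({key}) SORTED BY ({key}) INTO {buckets} BUCKETS"
    )


if __name__ == "__main__":
    print(hive_fact_ddl("orders", "customer_id", "order_date", 32))
```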

Assignee: Unassigned

Reporter: John Muller (Inactive)

Votes: 1

Watchers: 3

Created: 2015/04/24 11:25 AM

Updated: 2020/08/10 11:15 AM

Resolved: 2020/08/10 11:15 AM
