Table Join Optimization

In the previous article, we mentioned that Jimmer can not only auto-generate count-query from data-query, but also auto-optimize the count-query.

Since count-query only cares about total row count regardless of order and format, some table joins in the original data-query may not need to be copied over to count-query.

Jimmer will auto-optimize count-query to copy over as few table joins as possible from the original data-query.

Optimization Rules

Table joins used by the where clause cannot be optimized.

That is, only table joins used solely by the select or order by clause of the top-level query can be optimized.
Table joins based on collection associations (one-to-many, many-to-many) cannot be optimized.

Collection associations lead to duplicate data, thus affecting row count. So these table joins must be copied to count-query and cannot be optimized.
Table joins based on reference associations (one-to-one, many-to-one) may be optimized, if any of the following conditions is met:
- Join type is left outer join
- Although join type is inner join, the association is based on non-null real foreign key.
  
  So-called real foreign key means there is a foreign key constraint in the database. See Real vs Fake Foreign Keys.

In summary, to determine whether a table join in the original data-query can be automatically removed in the count-query, use the following optimization rules:

AND	Is based on reference association, i.e. many-to-one or one-to-one
	Is used solely by select or orderBy clause of top-level query
	OR	Is left outer join
		Association is non-null and foreign key constraint exists in database

Non-Optimizable Scenario

Java
Kotlin

BookTable book = Tables.BOOK_TABLE;
AuthorTableEx author = TableExes.AUTHOR_TABLE_EX;

ConfigurableRootQuery<BookTable, Book> query = sqlClient
    .createQuery(book)
    .where(
        book.price().between(
            new BigDecimal(20), 
            new BigDecimal(30)
        )
    )
    .orderBy(book.store().name())  
    .orderBy(book.name())
    .select(book);

int rowCount = query.fetchUnlimitedCount(); 

val query = sqlClient.createQuery(Book::class) {
    where(
        table.price.between(
            BigDecimal(20),
            BigDecimal(30)  
        )
    )
    orderBy(table.store.name) // α
    orderBy(table.name)
    select(table)
}  

val rowCount = query.fetchUnlimitedCount()

Note α:

Although table.store() is only used by orderBy clause without being used by where clause,

table.store() is inner join
Book.store association is nullable

So the optimization rules do not apply. The count-query still needs to retain table.store() and ultimately generates JOIN clause in SQL:

select
    count(tb_1_.ID)
from BOOK as tb_1_
inner join BOOK_STORE as tb_2_
    on tb_1_.STORE_ID = tb_2_.ID 
where tb_1_.PRICE between ? and ?

Optimizable Scenario

For the unoptimizable case discussed above, any of the following modifications will enable optimization:

Change Book.store association to non-null
Use left outer join

Here we choose the second approach, left outer join:

Java
Kotlin

BookTable book = Tables.BOOK_TABLE;
AuthorTableEx author = TableExes.AUTHOR_TABLE_EX;

ConfigurableRootQuery<BookTable, Book> query = sqlClient
    .createQuery(book)
    .where(
        book.price().between(
            new BigDecimal(20),
            new BigDecimal(30)
        )
    )
    .orderBy(book.store(JoinType.LEFT).name())
    .orderBy(book.name()) 
    .select(book);

int rowCount = query.fetchUnlimitedCount();

val query = sqlClient.createQuery(Book::class) {
    where(
        table.price.between(
            BigDecimal(20),
            BigDecimal(30)
        )
    )
    orderBy(table.`store?`.name) // α  
    orderBy(table.name)
    select(table)
}

val rowCount = query.fetchUnlimitedCount() 

Now optimization can take effect. The final count-query generates SQL:

select
    count(tb_1_.ID)  
from BOOK as tb_1_
where tb_1_.PRICE between ? and ?

Optimization Rules​

Non-Optimizable Scenario​

Optimizable Scenario​

Optimization Rules

Non-Optimizable Scenario

Optimizable Scenario