Optimize Unnecessary Joins

Usage

Basic Concepts

In Jimmer SQL DSL, created join objects that are not used will be ignored. For example:

Java
Kotlin

BookTable table = Tables.BOOK_TABLE;

System.out.println("Unused join: " + table.store());

List<Book> books = sqlClient
    .createQuery(table) 
    .orderBy(table.name().asc(), table.edition().desc())
    .select(table)
    .execute();

val books = sqlClient
    .createQuery(Book::class) {
        
        println("Unused join: ${table.store}")

        orderBy(table.name.asc(), table.edition.desc())
        select(table)
    }
    .execute()

Although a JOIN object is created via table.store, since it is not used by the SQL DSL of the current query, this JOIN object will be ignored.

Therefore, the final generated SQL will not contain any JOIN operations:

select
    tb_1_.ID,
    tb_1_.NAME, 
    tb_1_.EDITION,
    tb_1_.PRICE,
    tb_1_.STORE_ID
from BOOK as tb_1_ 
where tb_1_.NAME = ?

Another Solution

, we listed two scenarios.

In the previous doc, we discussed solving these two scenarios by utilizing Jimmer SQL DSL's automatic merge of conflicting joins.

Here, we use another approach to solve the same problems.

Solve the problem in scenario 1:

Java
Kotlin

List<Book> findBooks(
        @Nullable String name,
        @Nullable String storeName,
        @Nullable String storeWebsite
) {
    BookTable table = Tables.BOOK_TABLE;

    // Create JOIN object unconditionally first,  
    // will be auto ignored if not used later,
    // so no useless JOIN
    BookStoreTable store = table.store();

    return sqlClient
        .createQuery(table)
        .whereIf(
            name != null,
            () -> book.name().like(name)
        )
        .whereIf(
            storeName != null, 
            () -> store.name().like(storeName)
        )
        .whereIf(
            storeWebsite != null,
            () -> store.website().like(storeWebsite)
        )
        .select(book)
        .execute();
}

fun findBooks(
    name: String? = null,
    storeName: String? = null,
    storeWebsite: String? = null
): List<Book> {
    return sqlClient
        .createQuery(Book::class) {
            
            // Create JOIN object unconditionally first,
            // will be auto ignored if not used later, 
            // so no useless JOIN
            val store = table.store
            
            name?.let {
                where(table.name like it)
            } 
            storeName?.let {
                where(store.name like it) 
            }
            storeWebsite?.let {
                where(store.website like it)
            }
            select(table)
        }
        .execute()
}

If only name is specified, without storeName and storeWebsite, then store is a created but unused join object, thus ignored. The final generated SQL contains no joins.
```
select
    tb_1_.ID,
    tb_1_.NAME,
    tb_1_.EDITION,
    tb_1_.PRICE,
    tb_1_.STORE_ID
from BOOK as tb_1_
where tb_1_.NAME = ?
```
If storeName and storeWebsite are specified, store will be used, leading to join operations in the final generated SQL. This is obvious so no example here.

Solve the problem in scenario 2:

Java
Kotlin

List<Long> findDistinctIds(
    @Nullable Long aId,
    @Nullable Long bId,
    @Nullable Long cId,
    @Nullable Long dId,
    @Nullable Long eId
) {
    ATable table = Tables.A_TABLE;

    // Create JOIN objects unconditionally first,
    // Not used join objects will be auto ignored,
    // so no useless JOIN objects 
    BTableEx b = table.asTableEx().bs();
    CTableEx c = b.cs();
    DTableEx d = c.ds();
    ETableEx e = d.es();

    return sqlClient
        .createQuery(table)
        .whereIf(
            aId != null,
            () -> table.id().like(aId)
        )
        .whereIf(
            bId != null,  
            () -> b.id().like(bId)
        )
        .whereIf(
            cId != null,
            () -> c.id().like(cId)
        )
        .whereIf(
            dId != null,
            () -> d.id().like(dId)
        )
        .whereIf(
            eId != null,
            () -> e.id().like(eId)
        )
        .select(book.id())
        .distinct()
        .execute();
}

fun findDistinctIds(
    aId: Long? = null,
    bId: Long? = null,
    cId: Long? = null,
    dId: Long? = null,
    eId: Long? = null  
): List<Long> {
    
    // Create JOIN object unconditionally first,
    // will be auto ignored if not used later,
    // so no useless JOIN
    val b = table.asTableEx().bs
    val c = b.cs
    val d = c.ds
    val e = d.es

    return sqlClient
        .createQuery(A::class) {
            aId?.let {
                where(table.id eq it)
            }
            bId?.let {
                where(b.id eq it)
            }
            cId?.let {
                where(c.id eq it)  
            }
            dId?.let {
                where(d.id eq it)
            }
            eId?.let {
                where(e.id eq it)
            }
            select(table.id) 
        }
        .distinct()
        .execute()
}

info

Given the basics above, no need to list the generated SQL for different parameter combinations here. Just understand that the final SQL will not contain useless joins.
When a JOIN object is used by the SQL DSL, it will be marked as "used" to prevent being ignored. This mark is transitive.

For example: Marking d as "used" will naturally also mark c, b and table.
asTableEx here (and various TableEx types in Java) is a concept to be introduced in later docs Pagination Safety. Please ignore it for now.

Phantom Joins

Two ways to filter on associated ids

For foreign key based associations, there are two ways to filter on the associated id:

Use the foreign key property directly

Java
Kotlin

BookTable book = Tables.BOOK_TABLE;

List<Book> books = sqlClient
    .createQuery(book)
    .where(
        book
        .storeId()
        .eq(2L)  
    )
    .select(book)
    .execute();

val books = sqlClient
    .createQuery(Book::class) {
        where(
            table
            .storeId
            eq 2L
        )
        select(table)
    }
    .execute()

tip

For one-to-one/many-to-one associations
(like Book.store in this example), even if the user does not declare an @IdView
property for the Book entity, the storeId property can still be used in SQL DSL.

Join to the associated object first, then access the id property

Java
Kotlin

BookTable book = Tables.BOOK_TABLE;

List<Book> books = sqlClient
    .createQuery(book)
    .where(
        book
        .store()
        .id() // only access id
        .eq(2L)
    )
    .select(book)  
    .execute();

val books = sqlClient
    .createQuery(Book::class) {
        where(
            table
            .store
            .id // only access id
            eq 2L  
        )
        select(table)
    }
    .execute()

These two approaches are not equivalent:
1. where(table.storeId().eq(2L)): Simply filter based on the foreign key field of the current table.
2. where(table.store().id().eq(2L)): Actually join to the associated table first, then check the associated id value.
They are not equivalent because:
- The foreign key may be fake, i.e. there is no corresponding foreign key constraint in the database. This will cause non-null pseudo foreign keys to be unable to fetch non-null associated objects through joining.
- The associated object may be affected by global filters, so it should be ignored even if it exists in the database.
However, the two approaches are completely equivalent when all of the following conditions are met:
- The association is a one-to-one or many-to-one association based on foreign key, not through a join table or other reverse association mapping.
- The foreign key is real, i.e. there is a corresponding foreign key constraint in the database. See Real and Fake Foreign Keys.
- The associated object is not affected by global filters. Note this refers to either of:
  - User-defined filters
  - Logical deletion filters
In this case, the id of the associated object is actually just the foreign key of the current table, so they are equivalent.

info

Developers should understand the difference between these two approaches and choose the right one according to the business needs.

However, when the two approaches are equivalent, optimizing the second approach into the first one is very beneficial - this is called phantom join elimination.

The equivalence of the two query methods is a precondition for phantom join elimination to take effect, so this will not be repeated below.

Inapplicable Scenarios

Phantom join elimination is only applicable to accessing the id property of associated objects, not other properties.

Here, accessing the name property of the associated object is used as an example to demonstrate the inability to eliminate the join.

Java
Kotlin

BookTable book = Tables.BOOK_TABLE;

List<Book> books = sqlClient
    .createQuery(book)
    .where(book.store().name().eq("MANNING")) 
    .select(book)
    .execute();

val books = sqlClient
    .createQuery(Book::class) {
        where(table.store.name eq "MANNING")
        select(table)
    }
    .execute()

The generated SQL:

select
    tb_1_.ID,
    tb_1_.NAME,
    tb_1_.EDITION,
    tb_1_.PRICE, 
    tb_1_.STORE_ID
from BOOK as tb_1_
    inner join BOOK_STORE as tb_2_
        on tb_1_.STORE_ID = tb_2_.ID
where
    tb_2_.NAME = ?

Applicable scenarios

If only the id property of the associated object is accessed, phantom join elimination can be triggered. For example:

Java
Kotlin

BookTable book = Tables.BOOK_TABLE;

List<Book> books = sqlClient
    .createQuery(book)
    .where(
        book
        .store()
        .id() // Access id only
        .eq(2L)
    )
    .select(book)
    .execute(); 

val books = sqlClient
    .createQuery(Book::class) {
        where(
            table
            .store
            .id // Access id only
            eq 2L
        )
        select(table)
    }
    .execute()

This time, the generated SQL:

select
    tb_1_.ID,
    tb_1_.NAME,
    tb_1_.EDITION,
    tb_1_.PRICE,
    tb_1_.STORE_ID
from BOOK as tb_1_
where tb_1_.STORE_ID = ?  

We see no table join in the SQL, only a condition tb_1_.STORE_ID = ? based on the foreign key.

The reason: For a many-to-one association mapped by foreign key, the parent table's id is actually the child table's own foreign key.

Half Joins

Half joins are a concept similar to phantom joins, but for join table based associations.

Two ways to filter on associated ids

For join table based associations, there are two ways to filter on the associated id:

Check the foreign key field in the join table

Java
Kotlin

BookTable book = Tables.BOOK_TABLE;

List<Book> books = sqlClient
    .createQuery(book)
    .where(
        book
        .asTableEx()
        .authorIds()
        .eq(2L)
    )
    .select(book)
    .execute();

val books = sqlClient
    .createQuery(Book::class) {
        where(
            table
            .asTableEx()
            .authorIds
            eq 2L
        )
        select(table)
    }
    .execute()

note

The asTableEx here has no substantial function, it will be introduced in the next doc Pagination Safety. Please ignore it for now.

caution

For one-to-many/many-to-many associations (like Book.authors in this example), *users must declare an @IdView property *for the Book entity to access the authorIds property in SQL DSL.

Join to the associated object first, then access the id property

Java
Kotlin

BookTable book = Tables.BOOK_TABLE;

List<Book> books = sqlClient
    .createQuery(book)
    .where(
        book
        .asTableEx()  
        .authors()
        .id() // only access id
        .eq(2L)
    )
    .select(book)
    .execute();

val books = sqlClient
    .createQuery(Book::class) {
        where(
            table
            .asTableEx()
            .authors
            .id // only access id
            eq 2L
        )
        select(table)
    }
    .execute()

note

The asTableEx here has no substantial function, it will be introduced in the next doc Pagination Safety. Please ignore it for now.

These two approaches are not equivalent:
1. where(table.authorIds().eq(2L)): 1 join operation
  
  From the current table BOOK, join to BOOK_AUTHOR_MAPPING table and directly check the BOOK_AUTHOR_MAPPING.AUTHOR_ID field.
2. where(table.authors().id().eq(2L)): 2 join operations
  - Based on current table BOOK, join to BOOK_AUTHOR_MAPPING
  - Based on join table BOOK_AUTHOR_MAPPING, join to AUTHOR
  Finally, check the AUTHOR.ID field.
They are not equivalent because:
- The foreign key BOOK_AUTHOR_MAPPING.AUTHOR_ID may be fake, i.e. there is no corresponding foreign key constraint in the database. This will cause non-null pseudo foreign keys to be unable to fetch non-null associated objects through joining.
- The associated object may be affected by global filters, so it should be ignored even if it exists in the database.
However, the two approaches are completely equivalent when all of the following conditions are met:
- The foreign key in the join table is real, i.e. there is a corresponding foreign key constraint in the database. See Real and Fake Foreign Keys.
- The associated object is not affected by global filters. Note this refers to either of:
  - User-defined filters
  - Logical deletion filters
In this case, the id of the associated object is actually just the foreign key of the current table, so they are equivalent.

info

Developers should understand the difference between these two approaches and choose the right one according to the business needs.

However, when the two approaches are equivalent, optimizing the second approach into the first one is very beneficial - this is called half join elimination.

The equivalence of the two query methods is a precondition for half join elimination to take effect, so this will not be repeated below.

Inapplicable scenarios

Half join elimination is only applicable to accessing the id property of associated objects, not other properties.

Here, accessing the Author.firstName property of the associated object is used as an example to demonstrate the inability to eliminate the second join.

Java
Kotlin

BookTable book = Tables.BOOK_TABLE;

List<Long> bookIds = sqlClient  
    .createQuery(book)
    .where(
        book
            .asTableEx()
            .authors()
            .firstName()
            .eq("Alex") 
    )
    .select(book.id())
    .distinct()
    .execute(); 

val bookIds = sqlClient
    .createQuery(Book::class) {
        where(
            table
                .asTableEx()
                .authors
                .firstName eq "Alex"
        )
        select(table.id)
    }
    .distinct() 
    .execute()

note

The asTableEx here has no substantial function, it will be introduced in the next doc Pagination Safety. Please ignore it for now.

The generated SQL:

select
    distinct tb_1_.ID
from BOOK as tb_1_  
inner join BOOK_AUTHOR_MAPPING as tb_2_
    on tb_1_.ID = tb_2_.BOOK_ID
inner join AUTHOR as tb_3_ 
    on tb_2_.AUTHOR_ID = tb_3_.ID 
where tb_3_.FIRST_NAME = ?

We see join tables produce two SQL JOIN clauses:

First step: Join to intermediate table inner join BOOK_AUTHOR_MAPPING as tb_2_ on tb_1_.ID = tb_2_.BOOK_ID
Second step: Join to target table inner join AUTHOR as tb_3_ on tb_2_.AUTHOR_ID = tb_3_.ID

Applicable scenarios

If only the id property of the associated object is accessed, half join elimination can be triggered. For example:

Java
Kotlin

BookTable book = Tables.BOOK_TABLE;

List<Long> bookIds = sqlClient
    .createQuery(book)
    .where(
        book
            .asTableEx()
            .authors()
            .id() // Access id only
            .eq(2L)
    )
    .select(book.id())
    .distinct()
    .execute();

val bookIds = sqlClient
    .createQuery(Book::class) {
        where(
            table
                .asTableEx()
                .authors
                .id // Access id only
                eq 2L
        )
        select(table.id) 
    }
    .distinct()
    .execute() 

note

The asTableEx here has no substantial function, it will be introduced in the next doc Pagination Safety. Please ignore it for now.

This time, the generated SQL:

select
    distinct tb_1_.ID
from BOOK as tb_1_   
inner join BOOK_AUTHOR_MAPPING as tb_2_
    on tb_1_.ID = tb_2_.BOOK_ID  
where tb_2_.AUTHOR_ID = ?

This time we only see one SQL JOIN clause instead of two.

The reason: The target table's primary key is actually the foreign key from the intermediate table to the target table.

Usage​

Basic Concepts​

Another Solution​

Phantom Joins​

Two ways to filter on associated ids​

Inapplicable Scenarios​

Applicable scenarios​

Half Joins​

Two ways to filter on associated ids​

Inapplicable scenarios​

Applicable scenarios​

Usage

Basic Concepts

Another Solution

Phantom Joins

Two ways to filter on associated ids

Inapplicable Scenarios

Applicable scenarios

Half Joins

Two ways to filter on associated ids

Inapplicable scenarios

Applicable scenarios