Don't Use "NOT IN" in Postgres ~ UWInfo Blog

select id from shopline_customer where id not in (select id from shopline_customer_ext)
The SQL above never finishes.

SELECT c.id
FROM shopline_customer c
WHERE NOT EXISTS (
SELECT 1
FROM shopline_customer_ext e
WHERE e.id = c.id
);

The SQL above takes only 0.3 seconds.
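
To see why the two queries behave so differently, their plans can be compared with EXPLAIN. This is only a sketch: the exact plan text depends on the PostgreSQL version, table statistics, and settings, so no output is shown here.

```sql
-- Plan for the slow form: look for a SubPlan inside the Filter line.
EXPLAIN
SELECT id
FROM shopline_customer
WHERE id NOT IN (SELECT id FROM shopline_customer_ext);

-- Plan for the fast form: look for an Anti Join node.
EXPLAIN
SELECT c.id
FROM shopline_customer c
WHERE NOT EXISTS (
    SELECT 1
    FROM shopline_customer_ext e
    WHERE e.id = c.id
);
```

Plain EXPLAIN only plans the query and does not execute it, so it is safe to run even on the form that never finishes.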

GPT's explanation: (P.S. id is the primary key, non-null)
🚨 1. Why it's slow: the NOT IN performance trap
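NOT IN (subquery) is not planned as an anti-join; PostgreSQL evaluates it as a SubPlan filter.
For each id on the left, it has to check
that the id is absent from the subquery's result.
If the subquery's result contains any NULL, NOT IN yields "unknown" (NULL) instead of true for every row without a match, so no row passes the filter. These semantics are why the planner does not rewrite NOT IN as an anti-join.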
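
The NULL behavior of NOT IN can be reproduced without any tables. The following is a minimal sketch using literal VALUES lists; the alias t(x) is made up for the illustration.

```sql
-- No NULL on the right-hand side: 1 is not in (2, 3), so one row is returned.
SELECT 1 AS result WHERE 1 NOT IN (VALUES (2), (3));

-- A NULL on the right-hand side: 1 <> NULL is unknown, so the whole
-- NOT IN test is unknown and zero rows are returned.
SELECT 1 AS result WHERE 1 NOT IN (VALUES (2), (NULL));

-- NOT EXISTS only asks whether a matching row exists, so the NULL
-- does not matter and one row is returned.
SELECT 1 AS result
WHERE NOT EXISTS (
    SELECT 1
    FROM (VALUES (2), (NULL)) AS t(x)
    WHERE t.x = 1
);
```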
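Even though id is the primary key (indexed), the index is not used for per-row lookups here. If the subquery's result is estimated to fit in work_mem, PostgreSQL builds a hashed SubPlan and the query is fast; otherwise it falls back to a plain SubPlan that rescans the subquery's result once per outer row.
When shopline_customer and shopline_customer_ext are both large tables (for example, hundreds of thousands of rows), that is on the order of N × M comparisons,
and the run time explodes. NOT EXISTS, by contrast, can be planned as a Hash Anti Join or a Nested Loop Anti Join.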
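
PostgreSQL can hash the result of a NOT IN subquery only when it estimates that the result fits in work_mem. One way to check whether that limit is what makes the query slow is to raise work_mem for the current session and look at the plan again. The value below is an arbitrary illustration, not a recommendation.

```sql
-- Session-local change; '256MB' is an assumed, illustrative value.
SET work_mem = '256MB';

-- If the Filter line now mentions a hashed SubPlan, the subquery result
-- is probed through a hash table instead of being rescanned per row.
EXPLAIN
SELECT id
FROM shopline_customer
WHERE id NOT IN (SELECT id FROM shopline_customer_ext);

-- Restore the previous setting.
RESET work_mem;
```

Raising work_mem only works around the problem; rewriting the query as NOT EXISTS does not depend on the memory setting.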
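⚠️ The NOT IN + NULL problem:
Even when the id column is declared NOT NULL, the planner generally does not use that constraint to prove that NOT IN is equivalent to an anti-join.
The query therefore still runs as a SubPlan filter, whereas the NOT EXISTS form lets PostgreSQL choose an anti-join.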
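
Besides NOT EXISTS, the same anti-join can be written as a LEFT JOIN that keeps only unmatched rows. This is a sketch of an alternative rewrite, not something taken from the original measurements, so its timing on these tables is not known.

```sql
-- Keep the customers that have no matching row in shopline_customer_ext.
-- Because e.id is the join key, e.id IS NULL can only be true for
-- left-hand rows that found no match.
SELECT c.id
FROM shopline_customer c
LEFT JOIN shopline_customer_ext e ON e.id = c.id
WHERE e.id IS NULL;
```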
