Hi Laurent, great post!

I have a question about VACUUM, specifically around this statement: "By default, an implicit VACUUM is executed on a 7 days interval default value".

Does the above mean that a VACUUM command will somehow be triggered behind the scenes? I believe this is not the case, but I want to confirm. I know DLT has built-in automation that runs OPTIMIZE, VACUUM, or maybe both daily.

Ignoring DLT, let's assume we have a Delta table sitting on disk for a month or so. Unless we explicitly call VACUUM, the tombstoned files will never be deleted. I am not sure whether read queries are designed to trigger a VACUUM in the background every now and then, but my guess is no.
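To make the scenario concrete, here is a minimal sketch of how I would check whether tombstoned files are still sitting on disk past the retention window. It assumes a table on a local filesystem with relative file paths in the log, reads only the JSON commit files under `_delta_log`, and ignores checkpoints. The helper name `lingering_tombstones` and its parameters are my own, not part of any Delta API.

```python
import json
import time
from pathlib import Path


def lingering_tombstones(table_path, retention_hours=168):
    """List removed data files that are past retention but still on disk.

    Hypothetical helper for illustration only: it replays the JSON commits
    in _delta_log and does not read checkpoint files.
    """
    table = Path(table_path)
    # Anything tombstoned before this instant (in ms) is eligible for cleanup.
    cutoff_ms = (time.time() - retention_hours * 3600) * 1000
    removed = {}
    # Commit files are zero-padded, so name order is version order.
    for commit in sorted((table / "_delta_log").glob("*.json")):
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "remove" in action:
                entry = action["remove"]
                # deletionTimestamp is optional; treat a missing value as old.
                removed[entry["path"]] = entry.get("deletionTimestamp", 0)
            elif "add" in action:
                # A file added again in a later commit is live, not a tombstone.
                removed.pop(action["add"]["path"], None)
    return sorted(
        path
        for path, deleted_ms in removed.items()
        if deleted_ms < cutoff_ms and (table / path).exists()
    )
```

If this returns a non-empty list for a table that has only been read from for a month, that would suggest nothing vacuumed it in the background. On Spark, the explicit cleanup I have in mind would be something like `VACUUM my_table RETAIN 168 HOURS`, where `my_table` is a placeholder.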

Thanks.

Yousry Mohamed

Yousry is a principal data engineer working for Mantel Group. He is very passionate about all things data including Big Data, Machine Learning and AI.