[BUG] Secrets Not Being Cleaned Up in Spark Operator & AKV2K8S Integration #2257

gyurcse66 opened this issue Oct 16, 2024 · 0 comments

Description

We are encountering an issue where the Spark Operator and AKV2K8S are generating TLS secrets with every run of a ScheduledSparkApplication, but these secrets are not being cleaned up after the Spark driver completes. This results in hundreds of stale secrets being left behind, which affects resource usage and cluster performance.

We are specifically using the ScheduledSparkApplication for the job pyspark-pi-reporting, which is scheduled to run every 10 minutes. After each run, a new secret following the pattern akv2k8s-pyspark-pi-reporting-*-driver is created. These secrets contain fields like tls.key, tls.crt, and ca.crt, and are accumulating due to the lack of proper owner references or a cleanup mechanism.

  • [ 🆗 ] ✋ I have searched the open/closed issues and my issue is not listed.

Reproduction Code [Required]

Steps to reproduce the behavior:

1. Create a ScheduledSparkApplication that runs every 10 minutes in Kubernetes.
2. Configure AKV2K8S to handle secrets management using Azure Key Vault.
3. Observe that a new TLS secret is generated with each run and that these secrets are not deleted after the Spark driver finishes.
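For readers who want to reproduce this, the first step could look roughly like the sketch below. It builds a minimal ScheduledSparkApplication manifest as a Python dict and prints it as JSON, which `kubectl apply -f -` accepts. This is not the manifest from the original report: the helper name, the namespace, the image, and the application file are all placeholders, and the spec is intentionally incomplete (no driver/executor sizing, Spark version, or AKV2K8S settings).

```python
import json


def scheduled_spark_app(name: str, namespace: str, every_minutes: int) -> dict:
    """Build a minimal ScheduledSparkApplication manifest as a plain dict.

    Hypothetical helper for illustration; values marked "placeholder"
    are assumptions, not taken from the issue.
    """
    return {
        "apiVersion": "sparkoperator.k8s.io/v1beta2",
        "kind": "ScheduledSparkApplication",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Standard cron syntax: fire every `every_minutes` minutes.
            "schedule": f"*/{every_minutes} * * * *",
            "template": {
                "type": "Python",
                "mode": "cluster",
                "image": "example.invalid/spark-py:placeholder",  # placeholder
                "mainApplicationFile": "local:///path/to/pi.py",  # placeholder
            },
        },
    }


if __name__ == "__main__":
    manifest = scheduled_spark_app("pyspark-pi-reporting", "default", 10)
    print(json.dumps(manifest, indent=2))
```

With the schedule set to every 10 minutes, each triggered run creates a new driver pod, and according to the description above each run also leaves behind one `akv2k8s-pyspark-pi-reporting-*-driver` secret.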

Expected behavior

I expect the generated secrets to be cleaned up automatically after the Spark driver pod is deleted. Proper owner references should be set so that Kubernetes can handle the garbage collection of these secrets.
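To make the expectation concrete, the sketch below shows what an owner reference pointing from a generated secret to its driver pod might look like, expressed as a merge-patch body. It only illustrates the Kubernetes `ownerReferences` mechanism; whether the Spark Operator or AKV2K8S should be the component that sets it is exactly the open question here. The function name and the sample pod name and UID are hypothetical.

```python
def owner_reference_patch(pod_name: str, pod_uid: str) -> dict:
    """Return a merge-patch body that makes a Pod the owner of a Secret.

    Hypothetical helper: the owner must live in the same namespace as the
    Secret, and `pod_uid` must be the Pod's real metadata.uid, otherwise
    the garbage collector treats the owner as absent.
    """
    return {
        "metadata": {
            "ownerReferences": [
                {
                    "apiVersion": "v1",
                    "kind": "Pod",
                    "name": pod_name,
                    "uid": pod_uid,
                    # The pod does not manage the secret; it only owns it
                    # for garbage-collection purposes.
                    "controller": False,
                    "blockOwnerDeletion": False,
                }
            ]
        }
    }


if __name__ == "__main__":
    # Sample values are made up for illustration.
    patch = owner_reference_patch(
        "pyspark-pi-reporting-example-driver",
        "00000000-0000-0000-0000-000000000000",
    )
    print(patch)
```

With such a reference in place, deleting the driver pod would let the Kubernetes garbage collector remove the dependent secret without any extra cleanup job.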

Actual behavior

The secrets generated by the ScheduledSparkApplication are not being deleted after the Spark driver pod completes. They accumulate over time, leading to resource bloat in the cluster, which affects performance and necessitates manual secret deletion.
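Until a proper fix exists, the manual deletion mentioned above could be narrowed to secrets whose driver pod is already gone. The sketch below shows only the selection logic, on plain lists of names (for example, collected from `kubectl get secrets` and `kubectl get pods`). It assumes that the secret name is the driver pod name prefixed with `akv2k8s-`; that mapping is inferred from the naming pattern in this report and is not confirmed, so check it against a live cluster before deleting anything.

```python
from fnmatch import fnmatchcase

# Naming pattern reported in this issue.
SECRET_PATTERN = "akv2k8s-pyspark-pi-reporting-*-driver"
# Assumption: secret name == SECRET_PREFIX + driver pod name.
SECRET_PREFIX = "akv2k8s-"


def stale_secrets(secret_names, live_pod_names, pattern=SECRET_PATTERN):
    """Return secrets matching `pattern` whose assumed driver pod is gone."""
    live = set(live_pod_names)
    stale = []
    for name in secret_names:
        if not fnmatchcase(name, pattern):
            continue  # unrelated secret, never touch it
        pod_name = name[len(SECRET_PREFIX):]
        if pod_name not in live:
            stale.append(name)
    return stale


if __name__ == "__main__":
    # Made-up names for illustration.
    secrets = [
        "akv2k8s-pyspark-pi-reporting-run1-driver",
        "akv2k8s-pyspark-pi-reporting-run2-driver",
        "unrelated-secret",
    ]
    pods = ["pyspark-pi-reporting-run2-driver"]
    for name in stale_secrets(secrets, pods):
        print(name)
```

Keeping the selection separate from the deletion makes it possible to review the candidate list first, which matters when hundreds of secrets have accumulated.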

Terminal Output Screenshot(s)

Environment & Versions

Additional context
