Renaming Spark Part-NNNN Files on S3
We have run into a big issue with Spark jobs: because of Spark's distributed nature, a job writes its output files with part-nnnn names. It is not possible to choose the file names directly before writing, and modifying the underlying write functions is not easy.
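Since the names cannot be set at write time, one common workaround is to rename the files after the job finishes. S3 has no native rename operation, so a rename is a copy to the new key followed by a delete of the old one. The following is a minimal sketch of that idea, not necessarily the approach taken in this post. The helper names `plan_renames` and `rename_part_files`, the `base_name` parameter, and the `<base_name>-<index>` naming scheme are hypothetical. It assumes `client` is a boto3 S3 client (or anything exposing the same `list_objects_v2`, `copy_object`, and `delete_object` calls) and that `prefix` ends with a `/`.

```python
import re

# Spark part files look like "part-00000" or
# "part-00000-<uuid>-c000.snappy.parquet".
PART_FILE = re.compile(r"^part-\d+")


def plan_renames(keys, prefix, base_name):
    """Map each part-file key directly under `prefix` to a readable key.

    Hypothetical naming scheme: <prefix><base_name>-<index><extension>.
    """
    part_keys = sorted(
        key for key in keys
        if key.startswith(prefix) and PART_FILE.match(key[len(prefix):])
    )
    plan = {}
    for index, key in enumerate(part_keys):
        file_name = key[len(prefix):]
        # Keep everything from the first dot on (e.g. ".snappy.parquet")
        # so the file format stays recognisable.
        dot = file_name.find(".")
        extension = file_name[dot:] if dot != -1 else ""
        plan[key] = f"{prefix}{base_name}-{index:04d}{extension}"
    return plan


def rename_part_files(client, bucket, prefix, base_name):
    """Rename part files under `prefix` by copying, then deleting.

    Assumption: a single list_objects_v2 call is enough, which only holds
    for up to 1000 keys; larger outputs would need pagination.
    """
    response = client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    keys = [obj["Key"] for obj in response.get("Contents", [])]
    plan = plan_renames(keys, prefix, base_name)
    for old_key, new_key in plan.items():
        client.copy_object(
            Bucket=bucket,
            Key=new_key,
            CopySource={"Bucket": bucket, "Key": old_key},
        )
        client.delete_object(Bucket=bucket, Key=old_key)
    return plan
```

With boto3 installed, this could be called as `rename_part_files(boto3.client("s3"), "my-bucket", "output/run1/", "sales")`, where the bucket and prefix are placeholders. Marker files such as `_SUCCESS` are left untouched because they do not match the part-file pattern. Keeping the planning step separate from the S3 calls makes it possible to inspect the mapping before anything is copied or deleted.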