
Simply put, if you want to run Spark locally (v2.4.x), you must pin Hadoop to 2.6.5. You can use any version of the AWS Java SDK, but Hadoop is specifically locked to that version (a rough build sketch follows the list below). If you want to sidestep this coupling, it is better to upload files to S3 in one of two ways:

  • From your jar, using TransferManager (a minimal sketch follows the list)
    • With SDK v1.11.600 there is a bug that pulls in org.apache.httpcomponents:httpclient 4.5.8, which can flood the logs
  • From a bash script using aws s3 sync <-- recommended
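
As a rough sketch of what the version pinning can look like in an sbt build, assuming Scala 2.11 and Spark 2.4.5 (the exact patch versions here are illustrative, check what your environment actually ships):

```scala
// build.sbt -- version numbers are illustrative; verify against your Spark distribution
scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  // Spark 2.4.x core/SQL
  "org.apache.spark" %% "spark-core" % "2.4.5",
  "org.apache.spark" %% "spark-sql"  % "2.4.5",
  // Hadoop pinned to 2.6.5 as described above; hadoop-aws provides the s3a:// filesystem
  "org.apache.hadoop" % "hadoop-client" % "2.6.5",
  "org.apache.hadoop" % "hadoop-aws"    % "2.6.5"
)
```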
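
For the first option, a hedged Scala sketch of a TransferManager upload is below. The bucket name, key, and file path are placeholders, and it assumes the AWS Java SDK v1 (aws-java-sdk-s3) is on the classpath with credentials and region resolvable through the default provider chains.

```scala
import java.io.File

import com.amazonaws.services.s3.AmazonS3ClientBuilder
import com.amazonaws.services.s3.transfer.TransferManagerBuilder

object S3Upload {
  def main(args: Array[String]): Unit = {
    // Uses the default credential/region provider chains (env vars, ~/.aws, instance profile)
    val s3 = AmazonS3ClientBuilder.standard().build()

    // TransferManager handles multipart uploads and retries for large files
    val tm = TransferManagerBuilder.standard().withS3Client(s3).build()

    try {
      // Placeholder bucket/key/file -- replace with your own
      val upload = tm.upload("my-bucket", "output/part-00000.parquet", new File("/tmp/part-00000.parquet"))
      upload.waitForCompletion()
      println(s"Upload finished: ${upload.getState}")
    } finally {
      // Release the TransferManager's thread pool (by default this also shuts down the wrapped S3 client)
      tm.shutdownNow()
    }
  }
}
```

For the second (recommended) option, a single aws s3 sync <local-dir> s3://<bucket>/<prefix> call in your deploy script (paths are placeholders) keeps the SDK, and the httpclient clash mentioned above, out of your jar entirely.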
