Mount S3 to Databricks


I’m trying to understand how mounting works. I have an S3 bucket named myB, and a folder in it called test. I did a mount using:

My question is: does the mount create a link between the S3 bucket myB and Databricks, so that Databricks can access all the files in the bucket, including the files under the test folder? Or, if I instead mount using var AwsBucketName = "myB/test", does it link Databricks only to that test folder and not to any files outside of it?
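For reference, a Databricks S3 mount in Scala typically looks like the sketch below. The key values and the variable names here are placeholders (your original mount code isn't shown), but the shape follows the common dbutils.fs.mount pattern:

```scala
// Sketch of an S3 mount in a Databricks Scala notebook.
// AccessKey and SecretKey are placeholders -- substitute your own credentials.
val AccessKey = "YOUR_ACCESS_KEY"
val SecretKey = "YOUR_SECRET_KEY"
// URL-encode slashes in the secret key, since it is embedded in a URI
val EncodedSecretKey = SecretKey.replace("/", "%2F")

// Mounting "myB" exposes the whole bucket (including test/) under /mnt/myB.
// Mounting "myB/test" instead would expose only that folder.
val AwsBucketName = "myB"
val MountName = "myB"

dbutils.fs.mount(
  s"s3a://$AccessKey:$EncodedSecretKey@$AwsBucketName",
  s"/mnt/$MountName")
```

So the scope of the mount is whatever path you put in AwsBucketName: the bucket root gives you everything, a subpath gives you only that subtree.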

If so, how do I, say, list the files in the test folder, read a file, or count() a CSV file in Scala? I did a display("/mnt/myB") and it only shows the test folder, not the files in it. I'm quite new here. Many thanks for your help!


From the Databricks documentation:

If you are unable to see files in your mounted directory, it is possible that you have created a directory under /mnt that is not a link to the S3 bucket. If that is the case, try deleting the directory (dbutils.fs.rm) and remounting using the code sample above. Note that you will need your AWS credentials (AccessKey and SecretKey). If you don’t know them, you will need to ask your AWS account admin for them.
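To address the second part of the question: once the mount works, listing, reading, and counting in Scala look roughly like this. The file name myFile.csv is a placeholder (use one of the names that ls shows), and the header option is an assumption about your CSV:

```scala
// List the contents of the test folder; ls returns FileInfo objects
// (path, name, size), not just bare names.
display(dbutils.fs.ls("/mnt/myB/test"))

// Read a CSV file under test/ and count its rows.
// "myFile.csv" is a hypothetical file name for illustration.
val df = spark.read
  .option("header", "true")   // assumption: the CSV has a header row
  .csv("/mnt/myB/test/myFile.csv")

println(df.count())
```

Note that display("/mnt/myB") on a bare string only renders the string itself; to see directory contents you need to pass it the result of dbutils.fs.ls.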
