InvalidS3ObjectException: Unable to get object metadata from S3?

Question:

I am trying to use Amazon Textract to read multiple multi-page PDF files using the StartDocumentTextDetection method, as follows:

When I just retrieve the object from S3, I can see it printed out as:

I then use that s3_file.key to access the object later, but I get the following error that I can’t figure out:

InvalidS3ObjectException: An error occurred (InvalidS3ObjectException) when calling the StartDocumentTextDetection operation: Unable to get object metadata from S3. Check object key, region and/or access permissions.

So far I have:

  1. Checked the region from the boto3 session; both the bucket and the AWS configuration settings are set to us-east-2.
  2. The key cannot be wrong; I’m passing it directly from the object response.
  3. Permissions-wise, I checked the IAM console, and I have AmazonS3FullAccess and AmazonTextractFullAccess attached.

What could be going wrong here?

[EDIT] I renamed the files so that they no longer contain \\, but it still doesn’t work, which is odd.

Answer:

I ran into the same issue and solved it by specifying a region in the Textract client. In my case I used us-east-2.
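A minimal sketch of what that can look like in Python with boto3, assuming the bucket is in us-east-2. The function names, the bucket name, and the key here are placeholders for illustration, not values from the question. The point is only that `region_name` is passed explicitly when the Textract client is created, so that it matches the bucket's region instead of relying on whatever default gets resolved.

```python
def build_request(bucket, key):
    """Build the keyword arguments for StartDocumentTextDetection."""
    return {"DocumentLocation": {"S3Object": {"Bucket": bucket, "Name": key}}}


def start_text_detection(bucket, key, region="us-east-2"):
    """Start an asynchronous Textract job with an explicitly pinned region."""
    import boto3  # imported here so build_request works without boto3 installed

    # Pin the Textract client to the same region as the S3 bucket.
    textract = boto3.client("textract", region_name=region)
    response = textract.start_document_text_detection(**build_request(bucket, key))
    # The job runs asynchronously; the JobId is used later to fetch results.
    return response["JobId"]
```

With valid credentials, something like `start_text_detection("my-bucket", "docs/sample.pdf")` would return the job ID, which can then be passed to `get_document_text_detection` to retrieve the results.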

The clue to do so came from this issue: https://github.com/aws/aws-sdk-js/issues/2714
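If you want to confirm that a region mismatch is the cause before changing anything, you can compare the bucket's region with the region a default Textract client resolves to. This is a rough sketch, assuming boto3 and credentials that are allowed to call GetBucketLocation; the helper names are made up for illustration.

```python
def normalize_bucket_region(location_constraint):
    """Map the LocationConstraint from GetBucketLocation to a region name."""
    # S3 reports buckets in us-east-1 with a null LocationConstraint.
    if location_constraint is None:
        return "us-east-1"
    # Very old buckets may report the legacy value "EU" for eu-west-1.
    if location_constraint == "EU":
        return "eu-west-1"
    return location_constraint


def compare_regions(bucket):
    """Return (bucket_region, textract_region, match) for the given bucket."""
    import boto3  # imported lazily so the helper above has no dependencies

    s3 = boto3.client("s3")
    # No region_name here on purpose: this shows what the default resolves to.
    # boto3 raises NoRegionError if no default region is configured at all.
    textract = boto3.client("textract")

    constraint = s3.get_bucket_location(Bucket=bucket).get("LocationConstraint")
    bucket_region = normalize_bucket_region(constraint)
    textract_region = textract.meta.region_name
    return bucket_region, textract_region, bucket_region == textract_region
```

If the two regions differ, creating the Textract client with the bucket's region is the change to try.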
