Amazon Athena not parsing CloudFront logs

Question:

I’m following the Athena getting started guide and trying to parse my own CloudFront logs. However, the fields are not being parsed.

I used a small test file, as follows:

And created the table with this SQL:

But no data comes back:

[Screenshot: Athena query results with no data]

I can see the query returns 4 rows, but the first 2 should be excluded because they start with a #, so it looks as if the regex isn’t being applied correctly.

Am I doing something wrong? Or is the regex wrong (seems unlikely, as it’s in the docs, and looks fine to me)?

Answer:

This is what I ended up with:

Note the double backslashes are intentional.
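
As a rough illustration of why doubled backslashes are needed, here is a small Python sketch. It assumes the table definition's quoted string behaves like an ordinary (non-raw) string literal, where one layer of backslash escaping is consumed before the regex engine sees the pattern. The variable names are made up for this example.

```python
import re

# Inside an ordinary quoted string, "\\s+" is three characters: \, s, +
escaped = "\\s+"
# A raw string shows what the regex engine actually receives
raw = r"\s+"

print(escaped == raw)   # True: both are the pattern \s+
print(len(escaped))     # 3

# The pattern matches runs of whitespace, including tabs
fields = re.split(escaped, "a  b\tc")
print(fields)           # ['a', 'b', 'c']
```

With a single backslash, the string layer may swallow or reinterpret the escape, so the regex engine never sees `\s`.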

The format of the CloudFront logs changed at some point to add the protocol field. This handles both older and newer files.
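
The general idea of accepting both layouts can be sketched in Python with an optional trailing group. This is not the regex from the answer: the pattern below is a deliberately simplified, hypothetical one with three required tab-separated fields and an optional fourth standing in for the later-added protocol field. The sample lines are invented for illustration.

```python
import re

# Simplified, hypothetical pattern (NOT the real CloudFront layout):
# (?!#) rejects header lines that start with '#';
# the final (?:\t(...))? group is optional, so lines with or without
# the extra trailing field both match.
LINE_RE = re.compile(r"^(?!#)([^\t]+)\t([^\t]+)\t([^\t]+)(?:\t([^\t]+))?$")

def parse_line(line):
    """Return a tuple of fields, or None if the line does not match."""
    m = LINE_RE.match(line.rstrip("\n"))
    return m.groups() if m else None

old_line = "2014-05-23\t01:13:11\tFRA2"          # older layout, no protocol
new_line = "2014-05-23\t01:13:11\tFRA2\thttps"   # newer layout, with protocol
header = "#Version: 1.0"                         # header line, should not match

print(parse_line(old_line))  # last element is None
print(parse_line(new_line))  # last element is 'https'
print(parse_line(header))    # None
```

Making the new field optional, rather than maintaining two patterns, keeps one table definition usable across files written before and after the format change.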
