Accessing Amazon S3 from AWS Glue
A VPC endpoint for Amazon S3 enables AWS Glue to use private IP addresses to access Amazon S3 with no exposure to the public internet. AWS Glue does not require public IP addresses, and you don’t need an internet gateway, a NAT device, or a virtual private gateway in your VPC.
As you can see in the preceding image, you need:
- VPC
- Route table
- VPC endpoint for Amazon S3
1. VPC & Route table
Open the Amazon VPC console at https://console.aws.amazon.com/vpc/
Select the VPC dashboard and select the VPC that the Glue connection will use.
The details show the VPC ID and the main route table. Note the VPC ID and the main route table ID; you need both in the next step.
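If you prefer to look up the main route table from a script instead of the console, the following is a minimal sketch using boto3. It assumes boto3 is installed and credentials are configured; the function names, the VPC ID, and the region in the usage comment are illustrative and not part of the original walkthrough.

```python
def main_route_table_filters(vpc_id):
    """Build EC2 filters that match only the main route table of one VPC."""
    return [
        {"Name": "vpc-id", "Values": [vpc_id]},
        {"Name": "association.main", "Values": ["true"]},
    ]


def find_main_route_table_id(vpc_id, region_name):
    """Return the main route table ID of the VPC, or None if none is found."""
    import boto3  # imported lazily so the filter helper works without boto3

    ec2 = boto3.client("ec2", region_name=region_name)
    response = ec2.describe_route_tables(Filters=main_route_table_filters(vpc_id))
    tables = response["RouteTables"]
    return tables[0]["RouteTableId"] if tables else None


# Usage (hypothetical VPC ID and region, replace with your own):
# find_main_route_table_id("vpc-0123456789abcdef0", "us-east-1")
```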
2. Create S3 Endpoint
In the VPC console, select Endpoints, then:
- Click Create endpoint
- Add a name tag
- Select AWS services for Service category
- Select S3 service and Gateway type
- Select VPC
- Select Route table
- Click Create endpoint to save
In the details of the created endpoint, you can see the VPC ID and subnet IDs. You need the VPC ID and subnet IDs when you create the AWS Glue connection.
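The console steps above can also be sketched with boto3's `create_vpc_endpoint` call. This is an illustration under stated assumptions, not the only way to do it: the helper names and the name tag are hypothetical, and the service name format `com.amazonaws.<region>.s3` is assumed to apply to your region (some partitions use a different prefix).

```python
def s3_gateway_endpoint_params(vpc_id, route_table_ids, region_name, name_tag):
    """Build the request parameters for an S3 gateway endpoint."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        # Assumed service name format for standard AWS regions.
        "ServiceName": f"com.amazonaws.{region_name}.s3",
        "RouteTableIds": list(route_table_ids),
        "TagSpecifications": [
            {
                "ResourceType": "vpc-endpoint",
                "Tags": [{"Key": "Name", "Value": name_tag}],
            }
        ],
    }


def create_s3_gateway_endpoint(vpc_id, route_table_ids, region_name, name_tag):
    """Create the endpoint and return its ID."""
    import boto3  # imported lazily so the parameter helper works without boto3

    ec2 = boto3.client("ec2", region_name=region_name)
    params = s3_gateway_endpoint_params(vpc_id, route_table_ids, region_name, name_tag)
    response = ec2.create_vpc_endpoint(**params)
    return response["VpcEndpoint"]["VpcEndpointId"]


# Usage (hypothetical IDs, replace with the values noted in step 1):
# create_s3_gateway_endpoint(
#     "vpc-0123456789abcdef0", ["rtb-0123456789abcdef0"], "us-east-1", "glue-s3-endpoint"
# )
```

Keeping the parameter construction separate from the API call makes it easy to inspect the request before anything is created.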
3. Create Glue connection
Go to the AWS Glue console at https://console.aws.amazon.com/glue/home and select Connections, then click Add connection.
- Add Connection name
- Select Network for Connection type
- Select VPC and Subnet
- Select Security groups
- Confirm the settings by clicking Finish
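For reference, the same Network connection can be sketched with boto3's Glue `create_connection` call. The helper names, connection name, and IDs below are hypothetical, and passing the subnet's Availability Zone is an assumption about what your setup needs; treat this as a starting point and not a definitive implementation.

```python
def network_connection_input(name, subnet_id, security_group_ids, availability_zone):
    """Build the ConnectionInput structure for a Glue NETWORK connection."""
    return {
        "Name": name,
        "ConnectionType": "NETWORK",
        # A NETWORK connection carries no JDBC-style properties.
        "ConnectionProperties": {},
        "PhysicalConnectionRequirements": {
            "SubnetId": subnet_id,
            "SecurityGroupIdList": list(security_group_ids),
            "AvailabilityZone": availability_zone,
        },
    }


def create_network_connection(name, subnet_id, security_group_ids,
                              availability_zone, region_name):
    """Create the Glue connection in the given region."""
    import boto3  # imported lazily so the input helper works without boto3

    glue = boto3.client("glue", region_name=region_name)
    glue.create_connection(
        ConnectionInput=network_connection_input(
            name, subnet_id, security_group_ids, availability_zone
        )
    )


# Usage (hypothetical values, replace with your own):
# create_network_connection(
#     "s3-network-connection", "subnet-0123456789abcdef0",
#     ["sg-0123456789abcdef0"], "us-east-1a", "us-east-1"
# )
```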
4. Test connection
- Select the Glue connection
- Click Test connection
- Enter IAM role and S3 path
- Click Test connection
The test takes a few moments to complete. Wait for the result to confirm that the connection works.