I have an issue which is ultimately about AWS sage...
# help
I have an issue which is ultimately about AWS SageMaker access to VPC resources rather than lakeFS, but I figure some folk here might have worked through my scenario. I have a POC "open-source lakeFS" installation in AWS.
• An EC2 instance in a private VPC subnet hosts the lakeFS server, and traffic to the server is managed by an application load balancer.
• Currently I only allow traffic from my corporate VPN to reach the load balancer. With this setup I can access the lakeFS web UI just fine and load datasets via the Hugging Face datasets library from my local work machine (provided I'm on the corporate VPN).
• Most of my colleagues would need to access lakeFS-managed datasets via SageMaker Studio instances.
I'm having difficulty determining how to update the security group on my lakeFS application load balancer to allow traffic from these Studio instances. The underlying SageMaker domain has a "VPC only" configuration. As far as I can tell, the ENI injected by SageMaker Studio should itself be able to reach the underlying lakeFS server, since I've allowed the application load balancer to accept traffic from the security group attached to the domain / ENI. However, the only way I've been able to get a Studio instance to talk to the lakeFS server so far (tested via curl) is to explicitly allow inbound traffic from that Studio instance's IP to the application load balancer. This is obviously not very sustainable. Any guidance here would be much appreciated!
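For concreteness, the rule described above (the load balancer's security group accepting traffic from the security group attached to the Studio domain / ENI) might look roughly like the following boto3 sketch. The security group IDs are hypothetical placeholders, and port 443 is an assumption, since the post does not say which port the load balancer listens on.

```python
from typing import Any


def build_sg_ingress_rule(source_sg_id: str, port: int = 443) -> list[dict[str, Any]]:
    """Build an IpPermissions entry allowing TCP on `port` from another security group.

    `port` defaults to 443 as an assumption; use the ALB listener's real port.
    """
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            # Reference the source by security group ID instead of by IP/CIDR.
            "UserIdGroupPairs": [
                {
                    "GroupId": source_sg_id,
                    "Description": "SageMaker Studio domain ENIs",
                }
            ],
        }
    ]


def allow_studio_to_alb(alb_sg_id: str, studio_sg_id: str, port: int = 443) -> None:
    """Add the ingress rule to the ALB's security group (requires AWS credentials)."""
    import boto3  # imported lazily so the rule builder works without AWS access

    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(
        GroupId=alb_sg_id,
        IpPermissions=build_sg_ingress_rule(studio_sg_id, port),
    )


if __name__ == "__main__":
    # Hypothetical IDs for illustration only; this prints the rule without calling AWS.
    print(build_sg_ingress_rule("sg-0studio000000000", port=443))
```

One thing that may be worth checking against this kind of rule: a security-group reference matches traffic by the private IP addresses of the network interfaces in the source group, so it only applies when the request reaches the load balancer over private addressing inside the VPC.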