venkadesan elangovan
09/29/2022, 5:25 PM

Barak Amar
09/29/2022, 5:55 PM
1. Set serviceUrl (point it to your on-premises lakeFS) and pass your lakeFS credentials to access the data you currently have on-premises from Azure Data Factory (see the sketch after this list).
You will probably be required to open network access to lakeFS (HTTPS termination, load balancer, etc.).
2. Pulling data directly from MinIO will require querying metadata from lakeFS to understand which file is which. It will be easier to export the data from lakeFS in order to get it in the same layout you currently access it through lakeFS.
3. The Hadoop connector is for HDFS (https://learn.microsoft.com/en-us/azure/data-factory/connector-hdfs?tabs=data-factory), which will not give you access to lakeFS. Option 1 is the connector you want to use.
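For option 1, you can sanity-check the same S3-compatible endpoint with any S3 client before wiring it into Azure Data Factory. A minimal boto3 sketch, assuming a hypothetical lakeFS endpoint https://lakefs.example.com, a repository my-repo, and a branch main (all placeholder names and credentials):

```python
import boto3

# lakeFS exposes an S3-compatible gateway: the repository maps to the
# bucket name and the branch is the first path segment of the key.
# Endpoint, repo, branch, and credentials below are placeholders.
s3 = boto3.client(
    "s3",
    endpoint_url="https://lakefs.example.com",  # your lakeFS serviceUrl
    aws_access_key_id="AKIA...",                # lakeFS access key
    aws_secret_access_key="...",                # lakeFS secret key
)

# List objects on the 'main' branch of the 'my-repo' repository.
resp = s3.list_objects_v2(Bucket="my-repo", Prefix="main/")
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"])
```

The same serviceUrl and credential pair would then go into the Azure Data Factory Amazon S3 connector's linked service.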
But when you wrote that you would like to migrate to Azure - will the end result be running lakeFS on Azure, or just exporting data from your local lakeFS?
If you want to copy the data, check https://docs.lakefs.io/reference/export.html
Depending on the size of the data and the way you want to sync it, I suggest looking into using rclone or AWS S3DistCp.
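For a one-off copy of a smaller dataset, the export can also be scripted directly against the S3 gateway; a rough sketch, assuming the same placeholder endpoint and names as above plus a hypothetical Azure container named migrated-data (rclone or DistCp are the better fit for large volumes):

```python
import boto3
from azure.storage.blob import BlobServiceClient

# Placeholder endpoint, names, and credentials throughout.
s3 = boto3.client(
    "s3",
    endpoint_url="https://lakefs.example.com",
    aws_access_key_id="AKIA...",
    aws_secret_access_key="...",
)
azure = BlobServiceClient.from_connection_string("DefaultEndpointsProtocol=...")
container = azure.get_container_client("migrated-data")

# Walk the 'main' branch of 'my-repo' and re-upload each object to Azure.
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="my-repo", Prefix="main/"):
    for obj in page.get("Contents", []):
        body = s3.get_object(Bucket="my-repo", Key=obj["Key"])["Body"]
        # Drop the 'main/' branch prefix so blobs keep the repo-relative path.
        name = obj["Key"][len("main/"):]
        container.upload_blob(name=name, data=body, overwrite=True)
```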