# help
Julio Vilela:
Hi team! I have some Databricks-related questions, hoping you could guide me in the right direction:

1. What Maven version of `io.lakefs:hadoop-lakefs-assembly` should I use if I am targeting Databricks runtimes 12.2.x/14.1.x/14.2.x?
2. When launching ephemeral Databricks jobs, are lakeFS access/secret keys mandatory, or can we grant access via IAM roles instead? We typically create ephemeral jobs every hour and orchestrate them via Airflow. I want to avoid passing those secrets/access keys in the Spark configs as plain text, and I am wondering if instance profiles would be an option (where we would grant permissions to talk to the lakeFS server ahead of time).
3. If access/secret keys are mandatory, can we reuse the same keys for as many jobs as we need?
4. (More generic question) I noticed we can define branch protection rules, but I don't see a way of specifying the specific rules (analogous to the Rules/Rulesets page of a GitHub repo). Does this mean a rule in lakeFS basically forbids direct write operations to that particular branch, and you can only update it via merge commits? At least that is what I gather from the "How it works" section of the docs, but I wanted to double-check.

Thanks in advance!
Amit Kesarwani:
@Julio Vilela I am a Solutions Architect with lakeFS. Please see the answers to your questions below; ping me directly if you can't disclose certain things on a public channel.

1. Use the latest version, which is currently 0.2.4.
2. You can use an IAM role.
3. Yes, unless you want to track different jobs under different users for lineage or security purposes.
4. Yes, you are correct, but using RBAC you can control who can and cannot merge.
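For points 1–3, here is a minimal sketch of wiring up the lakeFS Hadoop FileSystem in a Spark job without hardcoding credentials. The `fs.lakefs.*` keys are the documented lakeFS Hadoop FileSystem options; the helper function, endpoint URL, and environment-variable names are just illustrative assumptions (on Databricks you would typically source these from a secret scope instead):

```python
import os

def lakefs_spark_conf(endpoint, access_key=None, secret_key=None):
    """Build Spark conf entries for the lakeFS Hadoop FileSystem.

    Reading the credentials from the environment (or a Databricks secret
    scope) keeps them out of plain-text job configs. The env-var names
    below are an illustration, not a lakeFS convention.
    """
    conf = {
        # Documented fs.lakefs.* options for io.lakefs:hadoop-lakefs-assembly:
        "spark.hadoop.fs.lakefs.impl": "io.lakefs.LakeFSFileSystem",
        "spark.hadoop.fs.lakefs.endpoint": endpoint,
    }
    access_key = access_key or os.environ.get("LAKEFS_ACCESS_KEY_ID")
    secret_key = secret_key or os.environ.get("LAKEFS_SECRET_ACCESS_KEY")
    if access_key and secret_key:
        conf["spark.hadoop.fs.lakefs.access.key"] = access_key
        conf["spark.hadoop.fs.lakefs.secret.key"] = secret_key
    return conf

# Example (hypothetical endpoint): pass the dict into the job's Spark conf,
# e.g. via the Databricks Jobs API "spark_conf" field that Airflow submits.
conf = lakefs_spark_conf("https://lakefs.example.com/api/v1")
```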
Julio Vilela:
@Amit Kesarwani thank you so much! This really helps.
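For reference on point 4, branch protection rules can be managed from the `lakectl` CLI; a sketch with a hypothetical repository name (the rule blocks direct writes to matching branches, so changes land only via merge):

```shell
# Protect "main" in a hypothetical repo (requires lakectl configured
# with credentials that have the relevant RBAC permissions):
lakectl branch-protect add lakefs://example-repo 'main'

# List the protection rules currently defined for the repository:
lakectl branch-protect list lakefs://example-repo
```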