Azure Data Lake is an excellent option for storing massive amounts of unstructured, semi-structured, and structured data in the cloud. One of the questions we often get about it is how to push files from local storage to Azure Data Lake based on an event. Let me show you how it's done using JSCAPE MFT Server.
I'm going to assume you already have your Data Lake storage shared folder ready and that you've already gathered the necessary connection details such as the Username, Password, Data Lake Tenant ID, and Data Lake Storage Account FQDN.
Once you have those, log in to your JSCAPE MFT Server administrative web interface and go to the Trading Partners module. The first step is to create a Data Lake trading partner. To create one, just click the Add button.
Select Microsoft Azure Data Lake from the Protocol drop-down list and then click OK.
Specify the Azure Data Lake trading partner parameters. Start by giving this trading partner a name, say, 'tp - azure data lake'. We'll use this name to refer to this trading partner in the JSCAPE MFT Server environment.
Now, enter the connection information we asked you to gather earlier.
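Before you fill in the dialog, it can help to confirm that none of the four connection details is blank. The sketch below is only a pre-flight checklist in Python; the class name `DataLakeConnection`, its field names, and the sample values are illustrative assumptions, not part of JSCAPE MFT Server or any Azure SDK.

```python
from dataclasses import dataclass, fields


@dataclass
class DataLakeConnection:
    """Hypothetical container for the four details named in this post."""
    username: str
    password: str
    tenant_id: str      # Data Lake Tenant ID
    account_fqdn: str   # Data Lake Storage Account FQDN


def missing_fields(conn: DataLakeConnection) -> list[str]:
    """Return the names of any fields that are empty or whitespace only."""
    return [f.name for f in fields(conn) if not getattr(conn, f.name).strip()]


# Placeholder values; the tenant ID is deliberately left blank.
conn = DataLakeConnection("someuser", "secret", "", "example-account.invalid")
print(missing_fields(conn))
```

An empty list means every value has been gathered and you can go ahead and enter them.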
You may test the connection by clicking the Test Server button. If the test succeeds, click OK on each successive dialog until you're back at the main screen.
Now that you have your trading partner ready, the next step is to create a trigger that will push files from a local drive to Azure Data Lake based on an event.
Go to the Triggers module and click the Add button.
You'll then be given the option to choose a trigger template that best suits your desired workflow. Let's skip that part for now and select the trigger elements ourselves. Click OK to proceed.
Give this trigger a name, say, 'push files to azure data lake'. Once you're done with that, you may then choose the event type that you want this trigger to respond to. Some of the commonly used event types include:
Current Time - This is a regular time-based event and is what you'll normally use if you want this trigger to fire on a predefined schedule.
Directory Monitor File Added - This event type works in conjunction with a directory monitor, which allows triggers to respond to certain directory-related events such as new file additions, file deletions, file changes, and file ageing. The 'Directory Monitor File Added' event, for example, lets this trigger respond if a new file is added to the monitored directory.
File Move - This event type allows this trigger to respond to any local file movement caused by another trigger that uses the Move File or Move Regex File trigger actions. In a way, you can use this trigger event type to monitor directory file additions without using a directory monitor.
If you scroll down that list, you'll see that there are several other event types to choose from. For this example, let's just use the Current Time event type. So, this trigger will fire on a predefined schedule.
We want this trigger to fire every Sunday at 8:30 PM, so we build the appropriate expression using the Expression Builder. For more information on how to use the Expression Builder, read the post 'Introducing the New Trigger Conditions Expression Builder'.
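To make the schedule condition concrete, here is a minimal Python sketch of the logic "every Sunday at 8:30 PM". It is not JSCAPE's expression syntax (the Expression Builder produces that for you); the function name `is_due` is an assumption made for illustration.

```python
from datetime import datetime

SUNDAY = 6  # datetime.weekday() counts Monday as 0 and Sunday as 6


def is_due(now: datetime) -> bool:
    """Return True when `now` falls on a Sunday at 20:30 (8:30 PM)."""
    return now.weekday() == SUNDAY and now.hour == 20 and now.minute == 30


print(is_due(datetime(2024, 1, 7, 20, 30)))  # 7 January 2024 was a Sunday
print(is_due(datetime(2024, 1, 8, 20, 30)))  # the following Monday
```

The condition has three parts (day of week, hour, minute), and all three must hold at once for the trigger to fire.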
After you click the Next button, you'll be brought to the Trigger Actions screen, where you can add the trigger action that will push the files in question from a specified local directory to Azure Data Lake based on the event you chose earlier.
Click Add to add a new trigger action.
For this example, we'll be using the Trading Partner Regex File Upload trigger action. This trigger action will upload files to a specified trading partner and will pick files to upload based on a regular expression or wildcard. Click OK to proceed.
Once you get to the trigger action parameters dialog, start by selecting the trading partner you created earlier in the Partner drop-down list. Next, specify the local directory from which files will be retrieved.
After that, select your preferred expression type. In this example, we'll just choose Wildcard and then enter *.* in the Regular Expression field. This will pick all files found in the specified local directory.
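If you'd like a feel for how a wildcard narrows down a file list, the Python sketch below uses the standard library's `fnmatch` module. It only illustrates wildcard matching in general; the helper name `pick_files` and the sample file names are assumptions. Note that in Python's implementation, `*.*` matches only names that contain a dot. Whether JSCAPE MFT Server treats extensionless files the same way isn't covered in this post, so it's worth testing with your own files.

```python
from fnmatch import fnmatchcase


def pick_files(names: list[str], pattern: str) -> list[str]:
    """Return the names that match a shell-style wildcard pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


names = ["report.csv", "data.json", "README"]

print(pick_files(names, "*.*"))    # names containing a dot
print(pick_files(names, "*.csv"))  # only CSV files
print(pick_files(names, "*"))      # every name, with or without a dot
```

A narrower pattern such as `*.csv` is a simple way to upload only one type of file from the local directory.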
Next, specify the destination remote directory on Azure Data Lake where you want those files to be uploaded. Here, I'm entering 'jscape/folder1', wherein jscape is a shared folder in my Azure Data Lake storage and folder1 is a subfolder underneath it.
Lastly, you may tick the Delete on Success checkbox to have the original files deleted after a successful upload.
Click OK when you're done.
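Conceptually, the parameters you just entered describe a loop: pick the matching files, upload each one to the remote directory, and delete the local copy only if that upload succeeded. The Python sketch below mimics that flow under stated assumptions. It is not how JSCAPE MFT Server implements the trigger action, and `push_files` and its `upload` argument are hypothetical names; `upload` stands in for whatever actually transfers a file.

```python
from fnmatch import fnmatchcase
from pathlib import Path
import posixpath


def push_files(local_dir, pattern, remote_dir, upload, delete_on_success=False):
    """Upload each matching file; delete the local copy only after a successful upload."""
    uploaded = []
    for path in sorted(Path(local_dir).iterdir()):
        if not path.is_file() or not fnmatchcase(path.name, pattern):
            continue
        # Remote paths use forward slashes regardless of the local OS.
        remote_path = posixpath.join(remote_dir, path.name)
        try:
            upload(path, remote_path)  # hypothetical callable supplied by the caller
        except OSError:
            continue  # upload failed: keep the local file so it can be retried
        uploaded.append(remote_path)
        if delete_on_success:
            path.unlink()
    return uploaded
```

Calling `push_files("/some/local/dir", "*.*", "jscape/folder1", my_upload, delete_on_success=True)` with your own `my_upload` function would mirror the settings used in this example. A file whose upload raises an error is left untouched, which is the point of deleting only on success.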
Once you're back at the Trigger Actions screen, you'll see your newly added trigger action on the trigger actions canvas. Drag an arrow from the Start output of the Workflow node to the Execute input of the new trigger action node.
Click OK to finalize the trigger creation process.
That's it. Now you know how to configure JSCAPE MFT Server so it can push files from a local directory to Azure Data Lake based on an event.
Want to try this out yourself? Get started with JSCAPE when you request your exclusive free trial experience.