Using Regular Expressions To Download Files With Date Formats In Their Filenames
Overview
Some of the most common download questions we get at our help desk involve filenames that contain dates. I ran into another one yesterday, so I thought it would be a good idea to write a blog post on the subject and build on it as I come across different solutions. Who knows? Some of you might find it useful. Better yet, you might be able to contribute your own expressions so that we can all benefit from them. Sound like a plan? Let's get this ball rolling.
Still new to regular expressions? Check out these articles first:
Using Regular Expressions in Triggers - Part 1
Exploring Regular Expressions in DLP
The general problem usually goes like this. The customer has a boatload of files on a remote server but only wants to download the ones whose filenames follow a certain format. Usually, that format involves date information. The usual solution in JSCAPE MFT Server (request a free trial) is a trigger action that supports regular expressions. These include:
- AFTP Regex File Download
- FTP Regex File Download
- FTPS Regex File Download
- SFTP Regex File Download
- Trading Partner Regex File Download
Choosing the right trigger action is the easy part. Composing the right regular expression? Not so easy. Here are some examples you can view for reference.
Download all AYYMMDD.txt
What we want to do here is to download all txt files starting with the letter A, followed by a year where only the last 2 digits are used, followed by a month expressed in two digits, and followed by a day of the month expressed in two digits. Here are some files that may match those requirements:
- A140321.txt
- A050218.txt
- A120709.txt
All other files should be ignored.
This regular expression should do the trick:
A[0-9][0-9](0[1-9]|1[012])(0[1-9]|[12][0-9]|3[01])\.txt
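If you'd like to try the expression before putting it in a trigger, here is a minimal Python sketch. It is only an approximation: Python's re module stands in for the server's regex engine, the whole filename is required to match (an assumption about how the trigger applies the expression), and the extension is written in lowercase so that it matches the sample names above. The helper name matches and the extra test filenames are made up for illustration.

```python
import re

# The AYYMMDD expression from the article, with a lowercase extension.
# Assumption: the whole filename must match (hence fullmatch below).
PATTERN = re.compile(r"A[0-9][0-9](0[1-9]|1[012])(0[1-9]|[12][0-9]|3[01])\.txt")


def matches(filename: str) -> bool:
    """Return True if the entire filename fits the AYYMMDD.txt format."""
    return PATTERN.fullmatch(filename) is not None


samples = [
    "A140321.txt",    # valid: year 14, month 03, day 21
    "A050218.txt",    # valid
    "A120709.txt",    # valid
    "A141321.txt",    # rejected: month 13 is out of range
    "B140321.txt",    # rejected: does not start with A
    "A20140321.txt",  # rejected: 4-digit year
    "A140231.txt",    # accepted, although February 31 is not a real date
]

for name in samples:
    print(name, matches(name))
```

Note the last sample. The expression only checks that the month is 01 to 12 and the day is 01 to 31, so an impossible date like February 31 still gets through.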
That was nice. However, most of the folks I encountered didn't want just any date. They wanted that date to be today. Not to worry. JSCAPE MFT Server variables should do the trick.
Download all files with YYYYMMDD where the date should be today
So if today is February 18, 2015, then possible matches could include:
- A20150218.txt
- jscape20150218.txt
- mftdownload20150218xyz.txt
- mftdownload20150218xyz.xls
Here, you would need the Format function, the variables Year, Month, and DayOfMonth, and the .* regex characters:
.*%Format("{0,number,00}{1,number,00}{2,number,00}",Year,Month,DayOfMonth)%.*
You can display a list of variables if you click the Add Variable button (see screenshot above).
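To see what the expression boils down to once the variables are filled in, here is a rough Python sketch. It does not call anything in JSCAPE MFT Server. It simply builds the same zero-padded date string with Python's own formatting and checks the sample filenames against it. The function name today_pattern is hypothetical, and matching the whole filename is an assumption.

```python
import re
from datetime import date


def today_pattern(today: date) -> str:
    """Build the expanded regex for a YYYYMMDD stamp on the given day."""
    # Zero-padded year, month, and day of month, e.g. "20150218".
    stamp = f"{today.year:02d}{today.month:02d}{today.day:02d}"
    # .* on both sides lets any characters precede or follow the date.
    return ".*" + stamp + ".*"


pattern = today_pattern(date(2015, 2, 18))
print(pattern)

files = [
    "A20150218.txt",
    "jscape20150218.txt",
    "mftdownload20150218xyz.txt",
    "mftdownload20150218xyz.xls",
    "A20150217.txt",  # yesterday's file, should be skipped
]

for name in files:
    print(name, re.fullmatch(pattern, name) is not None)
```

In a real trigger you would pass date.today() instead of a fixed date. The fixed date is used here only so the result lines up with the February 18, 2015 example.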
The problem with this use of the Format function is that it renders the year in 4 digits. That works for this particular example, but it won't do for, say, the first example, where the year is supposed to be expressed in only 2 digits.
Download all files with YYMMDD where the date should be today
Now, this one is a bit more complicated than the other two. Here's the solution (this is supposed to be a single line):
.*%Concat(Substring(Format("{0,number,00}",Year),2),Format("{0,number,00}{1,number,00}",Month,DayOfMonth))%.*
Notice that we used the Format function twice. The first one,
Format("{0,number,00}",Year)
obtains the year, while the second one,
Format("{0,number,00}{1,number,00}",Month,DayOfMonth)
obtains the month and day of the month.
Since the year is expressed in 4 digits, i.e., 2015, we had to use the Substring function
Substring(Format("{0,number,00}",Year),2)
to obtain only the last two digits. The 2 tells the Substring function to extract the part of the string "2015" that starts at index 2 and runs to the end. Strings start at index 0, so the 1 in 2015 has index 2.
After the last 2 digits were extracted, the result was then concatenated with the string representing the month and day of month, i.e., "0218".
%Concat(Substring(Format("{0,number,00}",Year),2),Format("{0,number,00}{1,number,00}",Month,DayOfMonth))%
Thus, the concatenated string would be "150218".
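The same three steps (format, take the substring, concatenate) can be mirrored in a few lines of Python. This is only an illustration of the logic, not the server's implementation. The function name yymmdd is made up, and Python slicing stands in for the Substring function.

```python
from datetime import date


def yymmdd(today: date) -> str:
    """Mimic Concat(Substring(Format(Year), 2), Format(Month, DayOfMonth))."""
    year = f"{today.year:02d}"                       # "2015"
    month_day = f"{today.month:02d}{today.day:02d}"  # "0218"
    last_two = year[2:]                              # index 2 onward: "15"
    return last_two + month_day                      # concatenation


print(yymmdd(date(2015, 2, 18)))  # 150218
```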
Of course, if the remote server had the following files:
- A20150218.txt
- jscape20150218.txt
- mftdownload20150218xyz.txt
- mftdownload20150218xyz.xls
then all 4 files would still be downloaded, because "20150218" contains "150218" and so each name still matches the regular expression. If you want to restrict the match to only those files where the desired date format is preceded by "A" (as in A150218.txt), then simply remove the first .* and replace it with "A" like so:
A%Concat(Substring(Format("{0,number,00}",Year),2),Format("{0,number,00}{1,number,00}",Month,DayOfMonth))%.*
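Here is a quick Python sketch comparing the two expressions after the date has been filled in for February 18, 2015. As before, Python's re module is a stand-in for the server's regex engine, whole-filename matching is an assumption, and the helper select and the file A150218.txt are added purely for illustration.

```python
import re


def select(pattern: str, names: list[str]) -> list[str]:
    """Return the names that match the pattern in full."""
    return [n for n in names if re.fullmatch(pattern, n)]


files = [
    "A20150218.txt",
    "jscape20150218.txt",
    "mftdownload20150218xyz.txt",
    "mftdownload20150218xyz.xls",
    "A150218.txt",
]

loose = ".*150218.*"   # leading .* allows anything before the date
strict = "A150218.*"   # the name must start with A followed by the date

print(select(loose, files))   # all five names
print(select(strict, files))  # ['A150218.txt']
```

The trailing .* is still there in the second expression, so a name like A150218_backup.csv would also be picked up. If you only want txt files, end the expression with \.txt instead.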
You can display a list of functions if you click the Add Function button (see screenshot above).
Perhaps we can stop here for now. If you'd like to share your own regular expressions for downloading files with date formats, please feel free to tweet us with the hashtag #dateregex or look for this post on our social media accounts and reply there. You can of course comment below instead if you want.
Get Started
Build powerful automated file transfer processes using triggers and regular expressions on JSCAPE MFT Server. Request the free, fully-functional evaluation edition now.