Group-name-version-lang1-lang2
(JavaScript-flavored) regular expressions are welcome!
Examples:
.*crawl matches both commoncrawl and paracrawl
-eng$|-eng- matches all English datasets without a country code
-eng(_US)?$|-eng(_US)?- matches all English datasets with the US country code or with no country code
-eng(_[A-Z]{2})?$|-eng(_[A-Z]{2})?- matches all English datasets, regardless of country code
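
The example patterns above can be tried out with a small script. This is a minimal sketch, not the filter's actual implementation: the dataset names and the `filterDatasets` helper are hypothetical, and it assumes each pattern is applied as an unanchored search via `RegExp.prototype.test`.

```javascript
// Hypothetical dataset names following Group-name-version-lang1-lang2.
const datasets = [
  "acme-commoncrawl-v1-deu-eng",
  "acme-paracrawl-v9-eng-fra",
  "acme-news-v2-eng_US-spa",
  "acme-news-v2-fra-eng_GB",
  "acme-wiki-v3-deu-fra",
];

// Hypothetical helper: keep the names in which the pattern matches anywhere.
function filterDatasets(names, pattern) {
  const re = new RegExp(pattern);
  return names.filter((name) => re.test(name));
}

const crawl = filterDatasets(datasets, ".*crawl");
const engNoCountry = filterDatasets(datasets, "-eng$|-eng-");
const engUsOrNone = filterDatasets(datasets, "-eng(_US)?$|-eng(_US)?-");
const engAny = filterDatasets(datasets, "-eng(_[A-Z]{2})?$|-eng(_[A-Z]{2})?-");

console.log(crawl.length);        // 2: commoncrawl and paracrawl
console.log(engNoCountry.length); // 2: plain "eng" only
console.log(engUsOrNone.length);  // 3: plain "eng" plus "eng_US"
console.log(engAny.length);       // 4: "eng" with or without a country code
```

Each pattern has two alternatives because the language code can sit either at the end of the name (`-eng$`) or in front of the second language (`-eng-`).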