Sone-162-javhd-today-04192024-javhd-today02-23-...
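    import re

    # Extract source (e.g., JAVHD); the check is case-insensitive
    if "JAVHD" in filename.upper():
        features["source"] = "JAVHD"

    # Extract movie ID (e.g., SONE-162); match against the uppercased name
    # so that mixed-case filenames such as "Sone-162-..." are also found
    movie_match = re.search(r'([A-Z]+-\d+)', filename.upper())
    if movie_match:
        features["movie_id"] = movie_match.group(1)

SONE-162-JAVHD-TODAY-04192024-JAVHD-TODAY02-23-...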
It looks like you're referencing a filename pattern from a JAV (Japanese Adult Video) source, possibly an MP4 file naming convention that includes a code (SONE-162), a site label (JAVHD), and dates.
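The extraction steps for the code, the site label, and the date can be sketched as one function. This is only a minimal illustration: the name `parse_filename` is hypothetical, and reading the 8-digit run (for example `04192024`) as a date in MMDDYYYY order is an assumption, since the filename itself does not say which order is used.

```python
import re
from datetime import datetime


def parse_filename(filename: str) -> dict:
    """Hypothetical helper: collect features from a filename."""
    features = {}
    upper = filename.upper()  # compare case-insensitively

    # Site label: simple substring check.
    if "JAVHD" in upper:
        features["source"] = "JAVHD"

    # Code: the first run of letters, a hyphen, then digits.
    movie_match = re.search(r"([A-Z]+-\d+)", upper)
    if movie_match:
        features["movie_id"] = movie_match.group(1)

    # Date: a standalone 8-digit run, ASSUMED to be MMDDYYYY.
    date_match = re.search(r"(?<!\d)(\d{8})(?!\d)", upper)
    if date_match:
        try:
            parsed = datetime.strptime(date_match.group(1), "%m%d%Y")
            features["date"] = parsed.date().isoformat()
        except ValueError:
            # Not a valid date under the assumed order; leave it out.
            pass

    return features


print(parse_filename("SONE-162-JAVHD-TODAY-04192024-JAVHD-TODAY02-23-..."))
```

Because `re.search` returns the first match, the code at the start of the name is picked up before any later letter-hyphen-digit run. If the filenames turn out to use a different date order, only the format string passed to `strptime` would need to change.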