🧩 Regular Expressions
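Regular expressions are patterns that describe sets of strings. In Python, the built-in re module uses them to search, extract, validate, and replace text.
Getting comfortable with them pays off quickly in Python data science work, for example when cleaning messy text columns or parsing log files!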
💻 Code Example:
import re

text = "Pynfinity users: santoshtvk@pynfinity.com, dhruv@gmail.com, tvk@dev.io"

# 1. re.findall: extract all emails
emails = re.findall(r"[\w.+-]+@[\w-]+\.[\w.]+", text)
print("Emails:", emails)

# 2. Named groups: parse structured data
log_line = "2026-04-21 17:31:05 ERROR utils.py:42 NullPointerException"
pattern = (r"(?P<date>\d{4}-\d{2}-\d{2}) "
           r"(?P<time>[\d:]+) "
           r"(?P<level>\w+) "
           r"(?P<file>[\w.]+):(?P<line>\d+) "
           r"(?P<error>.+)")
m = re.match(pattern, log_line)
if m:
    print("\nLog parsed:")
    for k, v in m.groupdict().items():
        print(f"  {k:<8} : {v}")

# 3. Lookbehind: digits preceded by ₹ (a lookahead, (?=...), checks what follows instead)
prices = "₹999 ₹1499 $20 €15"
inr_prices = re.findall(r"(?<=₹)\d+", prices)
print("\nINR prices (numbers after ₹):", inr_prices)

# 4. re.sub with function: mask emails
def mask_email(m):
    # Keep the first character of the user part and hide the rest
    user, domain = m.group().split("@")
    return user[0] + "***@" + domain

masked = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", mask_email, text)
print("\nMasked:", masked)

# 5. Verbose mode (readable patterns)
phone_pattern = re.compile(r"""
    (?:
        (?:\+91|0)?     # Optional country/trunk code
        [-\s]?          # Optional separator
    )?
    (?:[6-9]\d{9})      # Indian mobile: starts 6-9, 10 digits
""", re.VERBOSE)
contact_str = "Call us: +91-9876543210 or 08012345678"
phones = phone_pattern.findall(contact_str)
print("\nPhones found:", phones)

# 6. Compile for reuse in loops
compiled_uuid = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE
)
ids = "IDs: 56a5498a-4bbc-4609-9a94-14d69ecea624 and INVALID-ID"
print("\nUUIDs:", compiled_uuid.findall(ids))
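The example above demonstrates a lookbehind only. Lookaheads work the same way, but they check what comes after the match rather than before it. Here is a minimal sketch, assuming a made-up `inventory` string and the variable names `kg_values` and `not_kg` (none of these come from the original example):

```python
import re

# Hypothetical sample text, chosen only for illustration.
inventory = "12kg rice, 500g salt, 3kg sugar, 40 bags"

# Positive lookahead: digits immediately followed by "kg".
# The "kg" is checked but is not part of the returned match.
kg_values = re.findall(r"\d+(?=kg)", inventory)
print("Followed by kg:", kg_values)      # ['12', '3']

# Negative lookahead: digits NOT followed by "kg".
# The extra \d alternative stops the engine from backtracking
# to a partial number such as the "1" in "12kg".
not_kg = re.findall(r"\d+(?!\d|kg)", inventory)
print("Not followed by kg:", not_kg)     # ['500', '40']
```

Like the lookbehind, a lookahead is zero-width: it constrains where a match may occur without consuming any characters, so the unit suffix never ends up in the results.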
Keep exploring and happy coding! 💻