Skip to content

URL Pattern Matching

In order to make a handler function to run on specific websites, a url pattern parameter can be passed to @select() decorator. The url_match parameter should be valid Unix shell-style wildcards (see https://docs.python.org/3/library/fnmatch.html) or custom function that returns a boolean. The example below will only run if the URL of the current page matches *.com/*.

from dude import select


@select(css=".title", url_match="*.com/*")
def result_title(element):
    return {"title": element.text_content()}


@select(css="a.url", url_match=lambda x: x.endswith(".html"))
def result_url(element):
    return {"url": element.get_attribute("href")}

Examples

A more extensive example can be found at examples/url_pattern.py.