abstract_api.web_scraping package

Submodules

Module contents

class abstract_api.web_scraping.WebScraping(api_key: str | None = None)

Bases: BaseService[WebScrapingResponse]

AbstractAPI web scraping service.

Used to extract data from a given URL.

Attributes:

_subdomain: Web scraping service subdomain.

scrape(url: str, render_js: bool | None = None, block_ads: bool | None = None, proxy_country: str | None = None) -> WebScrapingResponse

Extracts data from the given URL.

Args:
url: The URL to extract the data from. It should include the full protocol (http:// or https://). If the URL contains parameters, it should be percent-encoded; for example, the & character becomes %26.

render_js: If True, the request renders JavaScript on the target site. JavaScript is rendered via a headless Google Chrome browser. Defaults to False.

block_ads: If True, the request blocks any advertisements it can identify on the target site. Defaults to False.

proxy_country: The country to make the request from, given as a two-letter ISO 3166-1 alpha-2 code.

Returns:

A WebScrapingResponse representing the API call response.
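
The argument rules above can be illustrated with a minimal sketch that prepares values for scrape() using only the standard library. The helper names encode_target_url and build_scrape_kwargs are hypothetical and not part of this package. Whether the client already encodes the URL when it builds the request is not stated here, so confirm that before pre-encoding to avoid double encoding.

```python
from urllib.parse import quote, urlsplit


def encode_target_url(url: str) -> str:
    """Hypothetical helper: check the scheme, then percent-encode the URL."""
    if urlsplit(url).scheme not in ("http", "https"):
        raise ValueError("url must start with http:// or https://")
    # Leave ':' and '/' alone so the scheme and path stay readable;
    # other reserved characters such as '?', '=' and '&' are encoded.
    return quote(url, safe=":/")


def build_scrape_kwargs(url, render_js=None, block_ads=None, proxy_country=None):
    """Hypothetical helper: assemble keyword arguments matching scrape()."""
    if proxy_country is not None:
        # ISO 3166-1 alpha-2 codes are exactly two ASCII letters.
        is_two_letters = (
            len(proxy_country) == 2
            and proxy_country.isascii()
            and proxy_country.isalpha()
        )
        if not is_two_letters:
            raise ValueError("proxy_country must be a two-letter code")
    return {
        "url": encode_target_url(url),
        "render_js": render_js,
        "block_ads": block_ads,
        "proxy_country": proxy_country,
    }


kwargs = build_scrape_kwargs("https://example.com/search?q=a&b=1", proxy_country="DE")
print(kwargs["url"])  # https://example.com/search%3Fq%3Da%26b%3D1
```

With the package installed, the resulting dictionary could presumably be passed on as WebScraping(api_key=...).scrape(**kwargs), following the signatures documented above.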

class abstract_api.web_scraping.WebScrapingResponse(response: Response)

Bases: FileResponse

Web scraping service response.