SCP03: Improper response URL join
What it does
Finds usage of urljoin() that can be replaced with
Response.urljoin().
Why is this bad?
uses the HTML
<base>tag to properly resolve relative URLs,doesn’t require an extra import, and
is more readable.
Example
import scrapy
from urllib.parse import urljoin
class MySpider(scrapy.Spider):
name = "myspider"
def parse(self, response):
yield {"homepage": urljoin(response.url, "/")}
Use instead:
import scrapy
class MySpider(scrapy.Spider):
name = "myspider"
def parse(self, response):
yield {"homepage": response.urljoin("/")}