SCP45: Unsafe meta copy

What it does

Reports the use of response.meta when creating a request.

Why is this bad?

response.meta is an alias for response.request.meta. It includes metadata set by components that is specific to the request that produced the response, and that metadata should not be passed on to new requests.

For example, RetryMiddleware uses the retry_times meta key to keep track of how many times a request has been retried. If you pass response.meta to a new request, you also pass that retry count, so the new request starts with part of its retry budget already spent and will be retried fewer times.
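
The effect can be illustrated with a simplified model that does not use Scrapy itself. Here meta is a plain dict, and retries_left() and MAX_RETRIES are hypothetical names introduced only for this sketch. The real retry accounting lives in RetryMiddleware and its settings.

```python
# Simplified model, not Scrapy's implementation: meta is a plain dict and
# retries_left() is a hypothetical helper introduced for illustration.
MAX_RETRIES = 2  # assumed retry budget per request


def retries_left(meta, max_retries=MAX_RETRIES):
    """Return how many retries a request with this meta still has."""
    return max_retries - meta.get("retry_times", 0)


# Meta of a request that carried user data and was retried twice
# before it finally got a response.
response_meta = {"foo": "bar", "retry_times": 2}

# Unsafe: the new request inherits the retry count along with "foo".
unsafe_meta = response_meta

# Safe: only the user-defined key is carried over.
safe_meta = {"foo": response_meta["foo"]}

print(retries_left(unsafe_meta))  # 0
print(retries_left(safe_meta))    # 2
```

In this model, the request built from the copied meta has no retries left before it has been sent even once.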

How to fix it?

Options include:

  • Use cb_kwargs.

    For example, instead of:

    def parse(self, response):
        return response.follow("/foo", self.parse2, meta={"foo": "bar"})
    
    
    def parse2(self, response):
        return response.follow("/bar", self.parse3, meta=response.meta)
    
    
    def parse3(self, response):
        foo = response.meta["foo"]
    

    Do:

    def parse(self, response):
        return response.follow("/foo", self.parse2, cb_kwargs={"foo": "bar"})
    
    
    def parse2(self, response, foo):
        return response.follow("/bar", self.parse3, cb_kwargs={"foo": foo})
    
    
    def parse3(self, response, foo): ...
    
  • If cb_kwargs feels too verbose, use the scrapy-sticky-meta-params plugin.

    For example, instead of:

    def parse(self, response):
        return response.follow("/foo", self.parse2, meta={"foo": "bar"})
    
    
    def parse2(self, response):
        return response.follow("/bar", self.parse3, meta=response.meta)
    
    
    def parse3(self, response):
        foo = response.meta["foo"]
    

    Configure StickyMetaParamsMiddleware, set sticky_meta_keys = ["foo"] in your spider class, and do:

    def parse(self, response):
        return response.follow("/foo", self.parse2, meta={"foo": "bar"})
    
    
    def parse2(self, response):
        return response.follow("/bar", self.parse3)
    
    
    def parse3(self, response):
        foo = response.meta["foo"]
    
  • Explicitly copy only the meta keys that you want to pass along.

    For example, instead of:

    def parse(self, response):
        return response.follow("/foo", self.parse2, meta={"foo": "bar"})
    
    
    def parse2(self, response):
        return response.follow("/bar", self.parse3, meta=response.meta)
    
    
    def parse3(self, response):
        foo = response.meta["foo"]
    

    Do:

    def parse(self, response):
        return response.follow("/foo", self.parse2, meta={"foo": "bar"})
    
    
    def parse2(self, response):
        return response.follow("/bar", self.parse3, meta={"foo": response.meta["foo"]})
    
    
    def parse3(self, response):
        foo = response.meta["foo"]
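
When several keys need to be passed along, the explicit mapping from the last option can be factored into a small helper. This is a minimal sketch: pick_meta() is a hypothetical function, not part of Scrapy, and the meta dict below is sample data.

```python
def pick_meta(meta, keys):
    """Return a new dict with only the listed keys that are present in meta."""
    return {key: meta[key] for key in keys if key in meta}


# Sample meta mixing a user-defined key with request-specific keys.
response_meta = {"foo": "bar", "retry_times": 1, "download_latency": 0.25}

new_meta = pick_meta(response_meta, ["foo"])
print(new_meta)  # {'foo': 'bar'}
```

In a callback, this would be used as meta=pick_meta(response.meta, ["foo"]), so that request-specific keys stay behind.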