I am relatively new to Scrapy. I am running into situations where some pages do not load properly, and I want to retry the task up to 2 more times to make sure it works correctly. Note that I do not get a 404 error; it fails while parsing the result because of a missing element.
It happens in only a few cases out of a hundred, and I cannot reproduce it because it passes the next time I retry. (I verified this by capturing the entire response body.)
What would be a good way to handle this?
I tried doing:

    from scrapy import Request

    def parse(self, response):
        try:
            # do something
            yield result
        except:
            # re-request the same URL and parse it again
            yield Request(response.url, callback=self.parse)
But I think these requests are getting filtered out because Scrapy recognizes them as duplicates. What is the best way to approach this problem?
Here is how I implemented my solution:

    from scrapy import Request

    def parse(self, response):
        meta = response.meta
        retries = meta.get(MISSING_RATINGS_RETRY_COUNT, 0)
        # raise on missing data only while retries are still available
        throw_on_failure = retries < MAX_RETRIES
        try:
            # do something
            # use the throw_on_failure variable to throw an exception
            # based on missing data in the response
            yield result
        except SpecificException:
            meta[MISSING_RATINGS_RETRY_COUNT] = retries + 1
            # dont_filter=True keeps the duplicate filter from dropping the retry
            yield Request(response.url, callback=self.parse,
                          meta=meta, dont_filter=True)
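The retry bookkeeping in the solution above can also be pulled out into a small helper, which makes the limit easier to check in isolation. This is only a sketch: `next_retry_meta`, `RETRY_KEY`, and the limit of 2 are names and values assumed for illustration, and none of them are part of Scrapy.

```python
# Hypothetical helper, not part of Scrapy: decides whether a failed parse
# should be retried and builds the meta dict for the retry request.
RETRY_KEY = "missing_ratings_retry_count"  # assumed meta key name
MAX_RETRIES = 2                            # assumed limit, taken from the question


def next_retry_meta(meta, max_retries=MAX_RETRIES, key=RETRY_KEY):
    """Return a copy of meta with the retry counter incremented,
    or None once max_retries retries have already been made."""
    retries = meta.get(key, 0)
    if retries >= max_retries:
        return None  # out of retries, the caller should give up
    new_meta = dict(meta)  # copy so the original meta is not mutated
    new_meta[key] = retries + 1
    return new_meta
```

Inside `parse`, the `except` branch would then call `next_retry_meta(response.meta)` and, if the result is not `None`, yield `Request(response.url, callback=self.parse, meta=retry_meta, dont_filter=True)`, where `retry_meta` is the returned dict. With the assumed limit, a page is fetched at most three times: the original attempt plus two retries.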