Hello everyone,
I have a technical question regarding the crawling of specific URLs on my website.
The situation:
The main URL of my website is, for example, https://homepage.de. However, during crawling, automatically generated URLs are also picked up, which I would like to exclude. One example of these unwanted URLs is: https://homepage.de/de/component/sppagebuilder/page/1.
What I’ve tried:
To prevent these URLs from being crawled, I added the entry Disallow: /components/ to my robots.txt file. Despite this, the links are still being crawled.
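For context, here is how I checked whether the rule matches the example URL, using Python's urllib.robotparser (this assumes the robots.txt contains only the one Disallow line; the live file has more directives):

```python
from urllib.robotparser import RobotFileParser

# Simulate a robots.txt containing only the rule described above.
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /components/",
])

# The unwanted URL from the example above:
url = "https://homepage.de/de/component/sppagebuilder/page/1"

# robots.txt Disallow rules are simple path-prefix matches, so this
# returns True: the URL's path begins with /de/component/, which does
# not start with the disallowed prefix /components/.
print(rp.can_fetch("*", url))  # → True (crawling still allowed)
```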
My question:
Is there a reliable way to prevent these links from being crawled, given that the robots.txt exclusion doesn't seem to take effect? Are other methods or configurations needed?
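In case it helps to see what I'm aiming for: a rule like the following is the kind of thing I had in mind (a sketch only — the exact path prefix is taken from the example URL above, and I'm not sure it covers all the generated URLs):

```
User-agent: *
Disallow: /de/component/sppagebuilder/
```

Since Disallow rules match by path prefix, such a rule would only cover URLs that actually start with that prefix; if the language segment varies, more rules (or, for crawlers that support them, wildcard patterns like /*/component/) might be needed.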
Thanks in advance for any help!