Hello guys! I am so excited to write this article. In this tutorial, I show you how to find all the links on a website using Python web scraping. Web scraping with Python helps us grab data from websites.
OK, alright! Let's dig into today's main topic. First things first: you need to install two required packages.
pip install requests
pip install bs4
requests: It sends the HTTP request and downloads the web page. To read more, see the requests documentation.
BS4 (BeautifulSoup): It parses the downloaded HTML and gives access to the HTML elements of a webpage. You can read all about it in the BeautifulSoup documentation.
Find all links in a website using Python web scraping
Here is the full code for this tutorial.
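import requests
from bs4 import BeautifulSoup

url = 'https://devpyjp.com/'  # Use your website's URL

# Download the page and parse its HTML
res = requests.get(url).content
soup = BeautifulSoup(res, 'html.parser')

# Find every a tag and print its href, skipping tags that have none
links = soup.find_all('a')
for link in links:
    href = link.get('href')
    if href:
        print(href)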
We get the response using the requests library, then pass its content to BeautifulSoup to create a soup object. The soup object lets us access the HTML elements on the webpage.
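The request step can fail (a bad URL, a timeout, an error status), so it may be worth wrapping it before building the soup object. Here is a minimal sketch of that idea; the helper name fetch_html and the 10-second timeout are my own assumptions for illustration, not part of the code above.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the page body as bytes, or None if the request fails.

    fetch_html and the default timeout are hypothetical choices
    made for this example.
    """
    try:
        response = requests.get(url, timeout=timeout)
        # Turn 4xx/5xx responses into an exception instead of parsing an error page
        response.raise_for_status()
    except requests.exceptions.RequestException as exc:
        print(f"Could not fetch {url}: {exc}")
        return None
    return response.content
```

You could then pass the returned bytes to BeautifulSoup as before, and skip parsing when the function returns None.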
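Here I used html.parser, which is Python's built-in HTML parser. BeautifulSoup also supports other parsers, such as lxml for HTML and XML. Please check the BeautifulSoup docs for the options.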
If you are familiar with HTML, you know that links are written with the anchor tag (the a tag). So I search for a tags in the parsed page using the soup object.
soup.find_all('a'): Returns a list of every a tag in the parsed page.
Each a tag stores its link in the href attribute. We use a for loop to go through the links list and print the href of each tag.
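Not every a tag has an href attribute, and find_all can filter those out for you. The sketch below shows this on a small piece of sample markup that I made up for the example, so it runs without downloading anything.

```python
from bs4 import BeautifulSoup

# Sample markup standing in for a downloaded page (made up for this example)
html = """
<a href="https://example.com/about">About</a>
<a name="top">Anchor without href</a>
<a href="/contact">Contact</a>
"""

soup = BeautifulSoup(html, 'html.parser')

# href=True keeps only the a tags that actually have an href attribute
hrefs = [a['href'] for a in soup.find_all('a', href=True)]
print(hrefs)
```

The second tag is skipped because it has no href, so only two links end up in the list.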
As output, we get every link found on that page.
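Some of the printed href values may be relative paths (for example /about) and the same link can appear more than once. If you want full, unique URLs, one option is urljoin from the standard library. This is only a sketch: the function name absolute_links and the sample hrefs are assumptions for illustration.

```python
from urllib.parse import urljoin


def absolute_links(base_url, hrefs):
    """Resolve each href against base_url and drop duplicates, keeping order."""
    seen = set()
    result = []
    for href in hrefs:
        # urljoin leaves absolute URLs unchanged and resolves relative ones
        full = urljoin(base_url, href)
        if full not in seen:
            seen.add(full)
            result.append(full)
    return result


# Made-up href values, as they might come out of the loop above
sample = ['/about', 'post-1', 'https://example.com/', '/about']
resolved = absolute_links('https://devpyjp.com/', sample)
print(resolved)
```

Here the relative paths are resolved against the page URL, and the repeated /about entry is kept only once.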
I hope this will be useful to you in some way. If you liked this tutorial, please let us know and subscribe to our newsletter. Thanks for reading. Happy coding!