Data collection using Python web scraping

Hello guys..! This is an amazing tutorial; trust me, you'll love it. In this tutorial we collect data using Python web scraping and store the data in JSON and CSV files. Don't you think that's amazing?

What do you need

  • Basics of Python
  • Basics of Python Web scraping

That's enough, guys! Even if you don't know these yet, check out my best articles on web scraping using Python here:

Basics of Web scraping using Python

Find all links in a website using Python web scraping

Ok, guys..! In this tutorial, we are targeting Flipkart. I am going to scrape mobile info and save it into CSV and JSON files.

Fortunately, this HTML page has class attributes for the price, mobile name, ratings, and reviews, so we can easily grab that data from the source page.
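
As a quick illustration of the pattern used in the full script below, here is a minimal sketch of grabbing text by class attribute with BeautifulSoup. The HTML fragment here is made up just to show the idea; it is not Flipkart's real markup.

from bs4 import BeautifulSoup

# A made-up HTML fragment, only to demonstrate the find_all pattern
html = '<div class="_3wU53n">Phone A</div><div class="_3wU53n">Phone B</div>'
soup = BeautifulSoup(html, 'html.parser')

# find_all returns every element whose class matches
for title in soup.find_all('div', class_='_3wU53n'):
    print(title.text)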

I am giving my entire code for this Data collection using Python web scraping tutorial below. You can copy and paste it into a Python file and run it from your cmd or terminal. It will create two files: one JSON and one CSV.

Note: Please make sure you have a working Internet connection.

To run this code, you need the requests, BeautifulSoup, and pandas libraries. requests and BeautifulSoup are used for scraping the data from websites (here, Flipkart), and pandas is used for creating the CSV file.

The json module was already installed when you installed Python on your system; it is used to store the data in JSON format.

If you don't have these libraries installed, use the commands below in your cmd or terminal.

pip install requests
pip install bs4
pip install pandas
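
If you want to double-check that everything installed correctly, a quick sanity check like the one below (my own habit, not part of the script itself) will fail immediately if anything is missing.

# Quick import check: raises ImportError if a library is missing
import requests
import bs4
import pandas
import json  # ships with Python, no install needed

print(requests.__version__, bs4.__version__, pandas.__version__)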

Fkart.py

import requests
from bs4 import BeautifulSoup
import json
import pandas as pd

# Flipkart search results page for mobiles
url = 'https://www.flipkart.com/search?q=mobiles'

# Download the page HTML
res = requests.get(url).content

soup = BeautifulSoup(res, 'html.parser')

# These class names come from Flipkart's markup at the time of writing
# and may change whenever the site is updated.
titles = soup.find_all('div', class_='_3wU53n')
ratings = soup.find_all('div', class_='hGSR34')
reviews = soup.find_all('span', class_='_38sUEc')
prices = soup.find_all('div', class_='_1vC4OE')

# Lists to hold the scraped values
mobiles = []
m_ratings = []
m_reviews = []
m_prices = []

for title, rating, review, price in zip(titles, ratings, reviews, prices):
    mobiles.append(title.text)
    m_ratings.append(rating.text)
    m_reviews.append(review.text)
    m_prices.append(price.text)

# Exporting to CSV

data = {'mobiles': mobiles, 'ratings': m_ratings, 'reviews': m_reviews, 'prices': m_prices}

df = pd.DataFrame(data=data)

print(df.head())

df.to_csv('mobile_data.csv', index=False)
print('Success..!')


# Exporting to JSON
d = json.dumps(data)
print(d)

# Load the string back just to confirm it is valid JSON
l = json.loads(d)

with open('mobile_data.json', 'w') as f:
    f.write(d)

print('Success..!')
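
Once you have saved the code above as Fkart.py, you can run it from your cmd or terminal like this:

python Fkart.py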

After scraping the data, we need to store it: for selling if you are a lead generator or freelancer, or for model building if you are a data analyst, data scientist, or machine learning engineer.

In the above code I used two formats for storing the data: one is CSV and the other is JSON. If you are a web developer, you love JSON, I know that.
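
If you want to check the stored data later, here is a small sketch of reading both files back, assuming the script above ran successfully and created mobile_data.csv and mobile_data.json in the same folder.

import json
import pandas as pd

# Read the CSV back into a DataFrame
df = pd.read_csv('mobile_data.csv')
print(df.head())

# Read the JSON back into a Python dict
with open('mobile_data.json') as f:
    data = json.load(f)
print(data.keys())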

In this way, you can store your web-scraped data in Python so that it will be helpful for future work.

If you like this tutorial and want to get free notifications, please subscribe to our newsletter.

Thank you..!
